Hi, Johan,
Regarding ''Translators-l Digest, Vol 182, Issue 15'', though not limited to this target document, I would like to tell you about a tendency in the English source texts that I find a bit inconvenient as a translator. I wonder, is there a better way to segment the translation source into smaller chunks with some tool before a translation request is announced?
Over the past two months I have found that translation sources tend to come in chunks that are too big, regardless of which WMF team issues them. With the start of a new fiscal year, new projects, and the final Movement Strategy phase, more requests to internationalize documents are coming in through mailing lists and announcements. Hats off to those who have completed those requested tasks!
To be precise. When working in the Translation tool, the segmentation of the translation source need not be based on context, but rather on length:
* fewer than five sentences each, please.
Take Tech News as an example. Apart from each issue being pleasantly short overall, it tempts me to jump in and translate because it applies ideal segmentation: as noted above, five sentences at most (and mostly fewer than that). Seriously, I am proud that Tech News has welcomed more translators in more languages: twenty as of September, up from around eleven a year ago, as I remember.
Translation corpus. In practice, we are also losing chances to expand the translation corpus. As I understand it, the longer the translation source, the lower the chance that a particular expression gets picked up into the corpus:
* A single sentence is not recorded on its own, data-wise, as a resource that can be applied on other occasions in multiple languages.
Thus, in my case, technical contexts like the one for Check-users are more tedious to maneuver through. It numbs me to have to scroll up and down to find my own translation of the same expression buried in long sentences, and I curse that I can’t rely on the suggestions the tool offers in the right-most pane.
Maybe I have been spoiled by that corpus function, but working in the tool makes me feel like doing more, as it is a relief that translation memory saves me from my very short short-term memory. Nor do I need to be embarrassed by a kind ping pointing out that I mistakenly created a variant of a fixed WM term in en-ja. I’ve even cheated by displaying the corpus output in both zh and ja, as we share terminology.
However, if documents to be translated are not going to be channeled through some automatic segmentation tool, I would hate to see a translation tech person do the manual labor of cutting so many documents down into smaller chunks.
I could edit the en original myself, but when I did so in the past, I alarmed other translators, who had to go back and update their work. I remember some cases where, on updates to the en original, the Fuzzy had re-segmented the texts; I patched the en text back to its old paragraphs, which triggered extra update needs, until a generous tech person rescued the document and patched up the chunks, saving translators time and energy.
The dilemma that involving fresh forces increases the proofreading workload would be much smaller with the good corpora that the Translation tool is designed for. We could then even share a single translation task among several hands for speedy output. Isn’t that how segmentation contributes? Of course, we can download the translation source as a whole page, do the task, and put the result back; that is an alternative, but it does not necessarily increase the number of translators.
We could, I hope, involve more translators in very important documents if source documents were supplied in smaller chunks. Being better tied to the corpus, more chunks would come with translation suggestions, so that we train ourselves as we work on a document. We would less often need to look up dictionaries or our own notes outside the tool, and we would stop making variants of WM terminology, which spares the proofreaders.
The Translation tool is a great tool for an inclusive group: the corpus shows past translation samples in the right-most pane, so that any newcomer to WM translation finds “case-specific examples and WM terminology”. That is why I have invited my friends to share the translation tasks.
We are facing a very busy season for internationalizing WM documents, and more hands should be available for those tasks, as more people now sit in front of computer screens and care about their fitness. Extra-long segments are not very ''fit'' for quality output, nor are they what the tool is designed for.
Please encourage translators by supplying a corpus to rely on and smaller chunks to work on, so that we stay energized to contribute to the community as ever.
Cheers, and thank you for reading to the bottom, --User:Omotecho
On 2020/09/29 21:01, translators-l-request@lists.wikimedia.org wrote:
Translation help: Help pages for new checkuser tool
Hi Omotecho,
Thanks for the reminder. I sure do prefer shorter chunks myself while translating, too. I can't do anything about this particular page without creating problems for the already existing translations, but your feedback is heard and noted and I'll pass it on as a reminder to others, too.
//Johan Jönsson --
On Fri, Oct 2, 2020 at 7:47 AM mw Omotecho omotechomw@gmail.com wrote:
Translators-l mailing list Translators-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/translators-l