How to extract acronyms from source text? Thread poster: Erik Freitag
| Erik Freitag Germany Local time: 11:08 Member (2006) Dutch to German + ...
Dear colleagues,

This may not be the best forum for my question, but here goes: I'm looking for a convenient way to extract all acronyms/abbreviations from the source text. As a working definition, I basically mean words that are written in capitals and are not found in standard monolingual dictionaries. Ideally, I'd like to have them exported as a list, possibly with the whole sentence they appear in for context.

If anyone knows a way to achieve this with SDL Trados Studio 2017, TermExtract, or third-party software, I'd be grateful for a hint.

Many thanks in advance, kind regards,
Erik

Adam Łobatiuk Poland Local time: 11:08 Member (2009) English to Polish + ...
For a rough list of acronyms in capital letters, you can copy and paste the text into MS Word, search with wildcards for <[A-Z]{2;}> (see note below) and replace with just bold formatting, and then search (without wildcards) for non-bold formatting and replace with ^p. That should leave you with only the ALL-CAPS words of two or more letters, separated by line breaks.

Note: in the wildcard expression, you may need to use {2,} instead of {2;}, depending on your system settings.

[Edited at 2018-02-22 20:00 GMT]
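For anyone who prefers to do this outside Word, the same idea can be sketched in Python. This is only a rough illustration, not a feature of Studio or TermExtract: the function name `extract_acronyms` is made up for this example, the sentence splitter is deliberately naive (it splits after `.`, `!` or `?` followed by whitespace), and like the Word search it only catches unaccented A-Z capitals and does no dictionary check.

```python
import re

# Whole words made of two or more ASCII capitals, like Word's <[A-Z]{2,}>.
ACRONYM_RE = re.compile(r"\b[A-Z]{2,}\b")

# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
# Assumption for this sketch; abbreviations such as "e.g." will split too early.
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def extract_acronyms(text):
    """Map each ALL-CAPS word to the list of sentences it appears in."""
    found = {}
    for sentence in SENTENCE_SPLIT_RE.split(text.strip()):
        for acronym in ACRONYM_RE.findall(sentence):
            contexts = found.setdefault(acronym, [])
            # Store each sentence only once per acronym.
            if sentence not in contexts:
                contexts.append(sentence)
    return found


if __name__ == "__main__":
    sample = "The CPU talks to the GPU. A USB port is on the PC. The CPU is fast!"
    for acronym, sentences in extract_acronyms(sample).items():
        for sentence in sentences:
            # Tab-separated, so the list can be pasted into a spreadsheet.
            print(f"{acronym}\t{sentence}")
```

The tab-separated output gives one acronym per line together with its sentence, which covers the "list with context" part of the question. You would still have to weed out ordinary words that happen to be written in capitals, such as headings.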