translate’s Linguistic Capabilities

The objective of a machine translation system is to translate as well and as quickly as possible. Translation quality is dependent on a number of parameters, the most important being:

  • suitability of the source text for machine translation: sentence length and complexity, ambiguity
  • availability of the necessary vocabulary
  • parsing and translation capability of the system
  • stylistic standard expected of the translation.

The translate system has been extensively tested, both systematically for handling linguistic constructions and translation problems, and by use on sample texts, mostly taken from the field of computing.

The speed at which translate translates obviously depends on the available hardware and software configuration, but is also dependent on the complexity of the sentences to be translated.

Some examples will demonstrate what translate is capable of. They have been translated between English and German as printed, and the results can be readily reproduced using the corresponding settings.

Homonyms

Homonyms are words with several different meanings, and these meanings can have completely different morphosyntactic properties. A word can, for example, belong to more than one part of speech: walk is both a verb and a noun. Homonyms are a common phenomenon in English:

He watches her watches.
She books some books.
In order to eat they order a meal.
They can can the fish in a can.

Each of these sentences uses one word in two (in the last sentence, three) different meanings. The meanings are stored in the dictionary together with criteria for when to use which one. translate provides the following German translations:

Er beobachtet ihre Uhren.
Sie bucht einige Bücher.
Um zu essen, bestellen sie eine Mahlzeit.
Sie können den Fisch in einer Dose eindosen.
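
The idea of storing several readings of a word together with a selection criterion can be sketched as follows. This is only an illustration, not the actual dictionary format of translate: the LEXICON table, the part-of-speech labels and the pick_translation function are hypothetical, and the part of speech is assumed to have been determined already by the parser.

```python
# Hypothetical toy lexicon: each English word form maps to its readings,
# keyed by part of speech. This is NOT the real dictionary format.
LEXICON = {
    "watches": {"verb": "beobachtet", "noun": "Uhren"},
    "books": {"verb": "bucht", "noun": "Bücher"},
}


def pick_translation(word: str, part_of_speech: str) -> str:
    """Return the reading of `word` that matches the given part of speech."""
    readings = LEXICON[word.lower()]
    return readings[part_of_speech]


# "He watches her watches": the first occurrence is a verb, the second a noun.
print(pick_translation("watches", "verb"))  # beobachtet
print(pick_translation("watches", "noun"))  # Uhren
```

In this sketch the part of speech is the only criterion; a real entry would carry further syntactic and semantic conditions.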

Semantic Types

Ambiguity of words can often be described by assigning them different semantic types (generic terms), and describing the semantic types of their slots (complements). A German word like Bank can be described as an institution and as furniture. Only in the context of a sentence is it possible to decide the specific meaning of Bank and hence the required translation.

With the German verb erheben the different semantic types of the slots result in different translations, as the following examples show:

Der Staatsanwalt erhob Anklage gegen den Mörder.

The public prosecutor brought charges against the murderer.

Der Gemeinderat hat eine Gebühr auf Abfall erhoben.

The district council has levied charges on waste.

Die Tenöre erhoben die Stimme.

The tenors raised the voice.

Er erhob sich.

He rose.
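
The erheben examples can be read as a lookup that depends on the semantic type of the object slot. The following sketch assumes invented type labels ("accusation", "fee", "voice") and a hypothetical translate_erheben function; it illustrates the principle only and does not show how translate encodes its entries.

```python
from typing import Optional

# Hypothetical semantic types for a few German nouns (illustration only).
SEMANTIC_TYPE = {
    "Anklage": "accusation",
    "Gebühr": "fee",
    "Stimme": "voice",
}

# Hypothetical entry for "erheben": the English verb is selected by the
# semantic type of the direct-object slot.
ERHEBEN_BY_OBJECT_TYPE = {
    "accusation": "bring",
    "fee": "levy",
    "voice": "raise",
}


def translate_erheben(direct_object: Optional[str] = None,
                      reflexive: bool = False) -> str:
    """Pick an English verb for 'erheben' from the object's semantic type."""
    if reflexive:
        # Reflexive use as in "Er erhob sich."
        return "rise"
    if direct_object is None:
        raise ValueError("a direct object is needed for non-reflexive use")
    return ERHEBEN_BY_OBJECT_TYPE[SEMANTIC_TYPE[direct_object]]


print(translate_erheben("Anklage"))       # bring
print(translate_erheben("Gebühr"))        # levy
print(translate_erheben(reflexive=True))  # rise
```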

Compound Words

Compound words are groups of words whose meaning cannot be derived from their component parts, and which must therefore be translated differently from the individual words they contain. They must be included in the dictionary if they are to be translated correctly. Compound words can contain all parts of speech; their component parts are often inflected (i.e. they change their form) and can often occur separated from each other. It is not easy to enter the properties of compound words correctly, so it was decided that users of translate should not be able to define every type of compound word.

The dictionary entry for the German verb stellen includes the information that when used together with Verfügung it should be translated as provide (rather than using the normal translation).

Ich werde die Vase zur Lampe stellen.

I will place the vase next to the lamp.

Ich werde die Vase zur Verfügung stellen.

I will provide the vase.
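
A dictionary entry of this kind can be pictured as a default translation plus overrides that apply when certain words occur in the same clause. The sketch below is an assumption about how such an entry might look; STELLEN_ENTRY and translate_stellen are hypothetical names, and the clause is assumed to be tokenized already.

```python
from typing import List

# Hypothetical dictionary entry for the German verb "stellen".
STELLEN_ENTRY = {
    "default": "place",
    # Override: (preposition, noun) found in the clause -> translation
    "collocations": {("zur", "Verfügung"): "provide"},
}


def translate_stellen(tokens: List[str]) -> str:
    """Return the English verb for 'stellen' given the tokens of its clause."""
    for (prep, noun), translation in STELLEN_ENTRY["collocations"].items():
        # The parts of a compound may be separated from the verb, so the
        # adjacent pair is searched for anywhere in the clause.
        for first, second in zip(tokens, tokens[1:]):
            if first == prep and second == noun:
                return translation
    return STELLEN_ENTRY["default"]


print(translate_stellen("Ich werde die Vase zur Lampe stellen".split()))      # place
print(translate_stellen("Ich werde die Vase zur Verfügung stellen".split()))  # provide
```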

Some compound words are mirrored in German and English:

Er verlor sein Gesicht.

He lost his face.

There is then no need for special definitions in the dictionary.

The most common compound words in English are noun-noun compounds, nouns with prepositional attributes and adjective-noun groups such as:

railway station

Bahnhof

table of contents

Inhaltsverzeichnis

environmental pollution

Umweltverschmutzung

Passive Constructions

English passive constructions differ somewhat from German ones. translate takes account of this difference when it translates.

English passive constructions are formed with to be, whereas German passive constructions are formed with werden (to become). In German, only the accusative object can become the subject of a passive sentence, but in English the indirect object can become the subject as well as the direct object. When the indirect object is the subject of an English passive sentence, the roles must be switched in the translated version. The following active sentences result in the same translation:

Alice has given John the book.
Alice has given the book to John.

Alice hat John das Buch gegeben.

When the direct object is made passive, this changes to:

The book was given to John by Alice.

Das Buch wurde John von Alice gegeben.

When the indirect object is made passive, the result is:

John was given the book by Alice.

John wurde das Buch von Alice gegeben.

There are two different forms of the passive in German:

Expressing a process:

Das Programm wird geladen.

Expressing a completed state:

Das Programm ist geladen.

When translated into English, both sentences are rendered as:

The program is loaded.

The problem here is that when translating from English into German it is often difficult to tell which option is correct.

Coordinating Conjunctions

translate uses a complex algorithm to analyze coordinating conjunctions, which makes it possible to translate correctly both simple conjunctions (like and or or) and conjunctional phrases (like both … and):

John hears and Mary sees the car.

John hört, und Mary sieht das Auto.

Both John and Mary see the car.

Sowohl John als auch Mary sehen das Auto.

John Cleverman wants to buy a new car but tries to avoid having to pay too much for it.

John Cleverman will ein neues Auto kaufen, aber versucht, es zu vermeiden, zu viel dafür bezahlen zu müssen.

Implicit Subjects

translate is capable of recognizing the implicit subjects of incomplete (non-finite) verb forms. The English verb want is an example. The system recognizes that in

John wants to leave.

John will gehen.

the subject of leave is linked to the subject of wants, while in

John wants Frank to leave.

John will, dass Frank geht.

the subject of leave is the object of wants (Frank).

There are, of course, exceptions to this general rule, but they do not present any problems.

Both in

John promised to leave.

John versprach zu gehen.

and in

John promised Frank to leave.

John versprach Frank zu gehen.

the subject of the infinitive clause is linked to the subject of the main clause.
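
The difference between want and promise can be expressed as a small table that records which argument of the main verb supplies the subject of the infinitive. The table CONTROL, its labels and the function implicit_subject are hypothetical and serve only to illustrate the rule and its exception.

```python
from typing import Optional

# Hypothetical control table: which argument of the main verb is understood
# as the subject of a following infinitive.
CONTROL = {
    "want": "object_if_present",  # John wants Frank to leave -> Frank leaves
    "promise": "subject",         # John promised Frank to leave -> John leaves
}


def implicit_subject(verb: str, subject: str, obj: Optional[str] = None) -> str:
    """Return the understood subject of the infinitive after `verb`."""
    rule = CONTROL[verb]
    if rule == "object_if_present" and obj is not None:
        return obj
    return subject


print(implicit_subject("want", "John"))              # John
print(implicit_subject("want", "John", "Frank"))     # Frank
print(implicit_subject("promise", "John", "Frank"))  # John
```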

Interrogative Sentences

As you would expect, translate also handles interrogative sentences correctly. Here are some examples:

When did you arrive?

Wann kamen Sie an?

When will she leave us?

Wann verlässt sie uns?

Who did she try to find?

Wen versuchte sie zu finden?

Can you explain to me the way to the station?

Können Sie mir den Weg zum Bahnhof erklären?

Who did they say John wanted to find?

Von wem sagten sie, dass John ihn finden wollte?

Incomplete Sentences

translate can also be used to translate individual words, and groups of words that make grammatical sense, if they are:

  • terminated by an end-of-sentence mark
  • terminated by a new-line character
  • selected, and translated using translate – Translate Sentence.

tree

Baum

yellow flowers

gelbe Blumen

The building beside the station.

Das Gebäude neben dem Bahnhof.

The man watching the car.

Der Mann, der das Auto beobachtet.

Please note that ambiguity can be more difficult to resolve in sentence fragments than in complete sentences, so it is advisable to check such translations with particular care.

Punctuation

Punctuation is very important for translate.

End-of-sentence characters like periods (.), exclamation marks (!), and question marks (?) are used to break down a text into individual sentences. Note that the period can have a number of different functions:

  1. to mark the end of a sentence
  2. to denote abbreviations
  3. as a decimal point (English)
  4. as a separator when writing large numbers in digits (German)
  5. to denote ordinal numbers (German)
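
Why the period needs special care when a text is broken into sentences can be seen in the following sketch. It is a simplified illustration, not the segmentation method used by translate: the abbreviation list is an assumed, deliberately tiny sample, the function name split_sentences is hypothetical, and German ordinal numbers are not handled at all.

```python
import re

# Assumed, deliberately tiny abbreviation list; a real system needs far more.
ABBREVIATIONS = {"e.g", "i.e", "etc", "Dr", "z.B"}


def split_sentences(text: str) -> list:
    """Split text at '.', '!' and '?' unless the period has another function."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]", text):
        pos = match.start()
        if match.group() == ".":
            before = text[pos - 1] if pos > 0 else ""
            after = text[pos + 1] if pos + 1 < len(text) else ""
            # Decimal point (3.50) or digit grouping (1.000.000):
            # digits on both sides of the period.
            if before.isdigit() and after.isdigit():
                continue
            # Period inside an abbreviation such as "e.g.": a letter follows.
            if after.isalpha():
                continue
            # Abbreviation: the word ending at this period is in the list.
            words = text[start:pos].split()
            if words and words[-1] in ABBREVIATIONS:
                continue
        sentences.append(text[start:pos + 1].strip())
        start = pos + 1
    tail = text[start:].strip()
    if tail:
        # Text without a final end-of-sentence mark is kept as a fragment.
        sentences.append(tail)
    return sentences


print(split_sentences("The price is 3.50 dollars. Dr. Smith paid it! Did he?"))
# ['The price is 3.50 dollars.', 'Dr. Smith paid it!', 'Did he?']
```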

Separator characters like commas (,), semicolons (;), dashes (-), dashes in lists, and colons (:) are used to separate sentence parts from each other. When analyzing the source-language text, translate allows a certain amount of latitude with regard to such separators. However, it should be noted that the presence or absence of a separator can decisively change the meaning.

Er befiehlt ihm zu helfen.

He orders him to help.

Unfortunately, it is not possible to ensure that translate always puts commas in the right place, so you should always check the placement of commas in translations carefully.

Parentheses are symbols that occur in pairs and can enclose words or groups of words, such as round brackets, square brackets, braces, dashes and quotation marks.

John wears the (blue) shirt (which he bought yesterday).

John trägt das (blaue) Hemd (das er gestern kaufte).

John, after he had left the office, went to the bank.

John ging, nachdem er das Büro verlassen hatte, zur Bank.

I like the book I bought yesterday.

Das Buch, das ich gestern kaufte, gefällt mir.

Ambiguity

The ambiguity of natural language is one of the major problems faced by machine translation. translate is equipped with a number of strategies for dealing with language ambiguity, including:

  • coding the different meanings of words and expressions with regard to part of speech, and syntactic and semantic properties
  • assessing how probable different analyses of a group of words are. The variant with the highest rating is then output as the translation. The rating is based on general grammatical rules. This means that translations that are not complete sentences are excluded when a more complete analysis exists.

The English word like will be used as an example to explain the principle. There are two translations for like: gefallen (verb) and wie (conjunction). In the following simple English sentence this gives rise to two possibilities:

I like it.

Es gefällt mir.
Ich wie es. (wrong!)

translate recognizes that the second variant is unlikely to be correct, and does not present it as a proposed translation.
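
The selection principle described above (rate each analysis, exclude incomplete sentences when a complete analysis exists, output the best one) can be sketched as follows. The Analysis class, the rating values and the function best_translation are assumptions made for illustration; the actual ratings used by translate are not documented here.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    translation: str
    rating: float      # hypothetical plausibility score, higher is better
    is_complete: bool  # True if the analysis forms a complete sentence


def best_translation(analyses):
    """Prefer complete-sentence analyses, then take the highest rating."""
    complete = [a for a in analyses if a.is_complete]
    pool = complete if complete else analyses
    return max(pool, key=lambda a: a.rating).translation


# Invented ratings for the two readings of "I like it."
candidates = [
    Analysis("Es gefällt mir.", rating=0.9, is_complete=True),
    Analysis("Ich wie es.", rating=0.1, is_complete=False),
]
print(best_translation(candidates))  # Es gefällt mir.
```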

If a sentence can have several meanings, and there are therefore several correct translations, one variant is always the simplest one in grammatical terms. The program opts for this variant. The translation of the following ambiguous question corresponds to the less probable meaning:

Which horse do you want to win?

Welches Pferd wollen Sie gewinnen?

Multiple Translations

There are often several different ways of translating a sentence. translate assesses the different translations and generally outputs the one with the highest score. It’s possible that this is not the translation you want, so translate now allows you to create a number of different translations and have them displayed. Below are some examples; the first two are translated from English into German, the last from German into English:

They complained to the guide that they could not see.

Sie beklagten sich beim Führer darüber, dass sie nicht sehen konnten.
Sie beschwerten sich beim Führer, den sie nicht sehen konnten.
Sie klagten zum Führer darüber, dass sie nicht sehen konnten.

She saw John leaving.

Sie sah John gehen.
Sie sah John, als sie ging.

Die Führung wählt das Team.

The leadership chooses the team.
The team chooses the leadership.

Influencing Translation Results

translate offers several translation options, and the way they are set can have a major impact on the translation. A key benefit of translate is that it also allows words to be translated differently from normal usage in the context of specific subject areas. Consider the English word enter, for example. It is generally used as the equivalent of the German word betreten, but in connection with computers it must be translated as eingeben. The options include:

  • Subject areas
  • Impersonal imperative
  • Translate impersonal request with imperative
  • Translate Sie as you instead of they
  • Translate you as Sie instead of du
  • Recognize what pronouns refer to
  • Line break as end of sentence
  • American English / British English
  • Provide multiple translations
  • Time limit per sentence
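
The effect of a subject-area setting, as in the enter example above, can be sketched as a specialized dictionary that takes precedence over the general one. The dictionary names, the subject-area label "computing" and the lookup function are hypothetical; they do not correspond to actual setting names in translate.

```python
from typing import Optional

# Hypothetical general dictionary and subject-area dictionaries.
GENERAL = {"enter": "betreten"}
SUBJECT_AREAS = {"computing": {"enter": "eingeben"}}


def lookup(word: str, subject_area: Optional[str] = None) -> str:
    """Subject-area entries override the general dictionary."""
    if subject_area is not None:
        specific = SUBJECT_AREAS.get(subject_area, {})
        if word in specific:
            return specific[word]
    return GENERAL[word]


print(lookup("enter"))               # betreten
print(lookup("enter", "computing"))  # eingeben
```

Falling back to the general dictionary keeps words without a specialized entry translatable when a subject area is selected.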

Spelling

translate makes every effort to follow the rules for German and English spelling and punctuation when producing its translations, including the correct use of upper-case and lower-case letters in German. However, some errors still occur, partly due to specific contexts and partly due to coding errors that were not detected early enough to be corrected. Generally, in German translations the current German spelling rules are used, rather than the reformed rules.

With regard to the source-language text, the aim has been to apply spelling rules liberally, provided that this does not lead to ambiguity. In particular, this concerns the rules governing the use of upper case and lower case in German, as well as the writing of words as one word or two. For comma placement, an option has been introduced to select between the old style and the new (“liberal”) style.

Im allgemeinen schreibt der Chef richtig.
Im Allgemeinen schreibt der Chef richtig.

The boss generally writes correctly.

Der Brief ist verlorengegangen.
Der Brief ist verloren gegangen.

The letter has been lost.

Er weiß, daß die Maße stimmen.
Er weiß, dass die Maße stimmen.

He knows that the measures are correct.
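
One way to picture this liberal handling of source-text spelling is to map spelling variants to a single form before the dictionary is consulted. The sketch below is an assumption, not the method used by translate: the VARIANTS table contains only the three examples above, normalize_spelling is a hypothetical name, and the choice of the reformed spelling as the common form is arbitrary.

```python
import re

# Hypothetical variant table built from the three example pairs above.
# The reformed spelling is used as the common form purely for illustration.
VARIANTS = [
    (re.compile(r"\bdaß\b"), "dass"),
    (re.compile(r"\bverlorengegangen\b"), "verloren gegangen"),
    (re.compile(r"\b([Ii]m) allgemeinen\b"), r"\1 Allgemeinen"),
]


def normalize_spelling(sentence: str) -> str:
    """Rewrite known spelling variants so both forms are analyzed alike."""
    for pattern, replacement in VARIANTS:
        sentence = pattern.sub(replacement, sentence)
    return sentence


print(normalize_spelling("Er weiß, daß die Maße stimmen."))
# Er weiß, dass die Maße stimmen.
```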