Criteria for Counting Words

CRITERIA FOR COUNTING WORDS IN TEXTS TO BE USED IN SHORTHAND SPEED DICTATIONS
(for the Portuguese language)
Organized by Prof. Waldir Cury (from Brazil)
With special contributions by Prof. Paulo Xavier, Director of Taquibras
GENERAL RULE: For the purpose of counting in shorthand dictations, every unit of sound and meaning (phoneme) that can by itself constitute an utterance should be counted as a word, regardless of its grammatical class or its number of syllables. Thus, both the noun “inconstitucionalidade” and the simple article “o” are counted as one word each.
NOTE: Characters that are not pronounced, such as asterisks and punctuation marks, should not be included in the count. A character or punctuation mark that is pronounced by the person dictating and recorded by the person taking the dictation should be counted as a word. For instance, in 158,8% (cento e cinqüenta e oito vírgula oito por cento) the comma is pronounced by the person dictating and written down by the stenographer. The comma should therefore be counted as a word.
IMPORTANT DETAILS FOR THE COUNT:
Ø      Any single letter of the alphabet that is pronounced and has, in the text, a meaning, should be counted as a word.  Ex.: Eis aí o “x” da questão.
Ø      Abbreviations, even when reduced to a single letter, should be counted as words.  E.g.: 2 h (duas horas)  =  two words.
Ø      The punctuation marks which are not pronounced by the person who dictates them should not be counted.
Ø      Cardinal numbers should be counted as if they were written in full.  So, the number “2493” (dois mil quatrocentos e noventa e três) should be counted as seven words.  “21” (vinte e um) is counted as three words.  “0,5” (zero vírgula cinco) is counted as three words.
Ø      Ordinal numbers should be counted as if they were written in full.  So, “21º andar” (vigésimo primeiro andar) should be counted as three words.
Ø      Fractions should be counted as if they were written in full.  So, “3/4” (três quartos) should be counted as two words.  “1/24” (um e vinte e quatro avos) should be counted as six words.
Ø      Roman numerals should be counted as if they were written in full.  Therefore, “capítulo XXXII” (capítulo trinta e dois) should be counted as four words.  “D. João VI” (Dom João Sexto) should be counted as three words.
Ø      Mathematical signs should be counted as if they were written in full.  So, “44%” (quarenta e quatro por cento) should be counted as five words.  “25+3=28” (vinte e cinco mais três igual a vinte e oito) should be counted as ten words.
Ø      In the expression 23/2 (vinte e três barra dois), the word “barra” (slash) is read aloud by the person dictating and taken down by the stenographer, so it is counted as a word.  Therefore, 23/2 (vinte e três barra dois) should be counted as five words.
Ø      The expression R$50.000,00 (cinqüenta mil reais) should be counted as three words, because that is how it is pronounced.
Ø      Compound words (with or without a prefix) joined by a hyphen, although forming a unit, should be counted according to the number of words that enter into their formation, because each word is taken down independently.  Thus, the compound word “grão-de-bico” should be counted as three words, and the compound word “peixe-espada” as two words.
Ø      Foreign compound words joined by a hyphen follow the same criterion: they are counted according to the number of words that enter into their formation.  E.g.: “know-how” (two words), “habeas-data” (two words), “hors-concours” (two words).
Ø      Compound words without a hyphen should be counted as one single word.  Examples: “minissaia”, “minissubmarino”, “superocupado”, “superinterglacial”.
Ø      Compound words joined by the preposition “de” in which the vowel “e” is dropped (apocope, written d’) should be counted in the following way: “estrela-d’alva” (two words), “mãe-d’água” (two words), “olho-d’água” (two words), “pau-d’arco” (two words).
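The hyphen and punctuation rules above can be sketched in a few lines of Python. This is only an illustration, not part of the original criteria: the function name `count_words` and the punctuation set `_PUNCT` are assumptions, and the sketch expects a text whose numbers, signs and abbreviations have already been written out in full.

```python
# Unpronounced punctuation to ignore (an assumed set; extend it as needed).
# The apostrophe is stripped only from the edges of a word, so "d’alva" stays whole.
_PUNCT = "“”\"'‘’.,;:!?()[]*–"


def count_words(text: str) -> int:
    """Count dictation words in a text that is already written out in full."""
    count = 0
    for chunk in text.split():
        # Each element of a hyphenated compound is taken down separately,
        # so each one is counted separately.
        for part in chunk.split("-"):
            # Skip chunks that contain nothing but unpronounced punctuation.
            if part.strip(_PUNCT):
                count += 1
    return count


print(count_words("grão-de-bico"))      # 3
print(count_words("estrela-d’alva"))    # 2
print(count_words("minissaia"))         # 1
```

Pronounced signs are covered simply because they appear as ordinary words once the text is written in full, as in “cento e cinqüenta e oito vírgula oito por cento”, which this sketch counts as nine.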
ABBREVIATIONS AND ACRONYMS
Ø     When an abbreviation forms a pronounceable word, it should be counted as one word.  E.g.: ONU (Organização das Nações Unidas), PIS (Programa de Integração Social), SIDA (Síndrome de Imunodeficiência Adquirida), etc.
Ø     Acronyms (words formed from the first letters or syllables of other words) that form a pronounceable word should be counted as one word.  E.g.: Sudam (Superintendência do Desenvolvimento da Amazônia), UNESCO (United Nations Educational, Scientific and Cultural Organization).
Ø     When there is a joining element between the letters of an abbreviation, this element should be counted as one word.  For instance, the abbreviation “PCdoB” should be counted as four words, since this is exactly the number of independently pronounced words/syllables/letters (pê-cê-dô-bê).
Ø     A “siglema” (an abbreviation that forms a word) should be counted as one word.  E.g.: Petrobras, Unesco, Taquibras.
Ø     A “siglóide” (an abbreviation that does not form a word, or does not have the character of a word) should be counted according to the number of letters of which it is composed.  E.g.: B.N.D.E.S (counted as five words), E.U.A. (counted as three words).
Ø      For texts containing numbers with more than one unit, it is advisable to rewrite the text with the numbers written in full, and only then count the words.  Otherwise a count boundary can fall in the middle of a number, which makes the counting difficult.  For example, in 158,8% (cento e cinqüenta e oito vírgula oito por cento) there are nine words to be counted.  If the count falls right after the word “cinqüenta”, or inside the sign “%” (por cento), marking it on the numeral is difficult, if not impossible.
Ø      In verbal forms with enclitic and proclitic pronouns, as well as with tmesis, each element should be counted as a word, whether or not the elements are linked by a hyphen.  Therefore:
Quero que me diga. (four words)
Diga-me. (two words)
Possuí-las. (two words)
Amá-lo-ia. (three words)
Oferecê-la-ia. (three words)
Ø      A word that is not written but is implicit, and is therefore pronounced by the person dictating and recorded by the person taking the dictation, should be counted as a word.  E.g.: in “9h30” (nove horas e trinta minutos) five words should be counted, including the word “minutos” and the conjunction “e”, which do not appear in the written form.
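The abbreviation rules in this section can be sketched as a small Python helper. It is a rough illustration under stated assumptions: the name `count_abbreviation` and the flag `pronounced_as_word` are hypothetical, the flag must be set by a person (whether “ONU” is read as a word cannot be derived from its letters), and the pattern handles unaccented letters only.

```python
import re


def count_abbreviation(abbr: str, pronounced_as_word: bool) -> int:
    """Count dictation words for an abbreviation or acronym.

    pronounced_as_word is a human judgment (hypothetical flag): True for a
    "siglema" such as ONU or Petrobras, False for a "siglóide" such as E.U.A.
    """
    if pronounced_as_word:
        return 1  # read as a single word
    letters_only = abbr.replace(".", "")
    # Each capital letter is spelled out on its own. A lowercase run, such as
    # the "do" in "PCdoB", is a joining element and counts as one word.
    return len(re.findall(r"[A-Z]|[a-z]+", letters_only))


print(count_abbreviation("ONU", pronounced_as_word=True))         # 1
print(count_abbreviation("B.N.D.E.S", pronounced_as_word=False))  # 5
print(count_abbreviation("PCdoB", pronounced_as_word=False))      # 4
```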
FINAL REMARK: In order to avoid mistakes, it is highly advisable to write the whole text out in full before counting the words.  E.g.: 158,8% should be written in full: cento e cinqüenta e oito vírgula oito por cento.
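A small Python sketch shows why writing the text in full matters when count marks are placed. The helper `mark_every` and its “/” marker are assumptions made for illustration, and the sketch splits on whitespace only, so it does not apply the hyphen rules.

```python
def mark_every(text: str, n: int, marker: str = "/") -> str:
    """Insert `marker` after every n-th word of a text written out in full."""
    words = text.split()
    marked = []
    for position, word in enumerate(words, start=1):
        marked.append(word)
        # Mark a boundary after every n-th word, except at the very end.
        if position % n == 0 and position < len(words):
            marked.append(marker)
    return " ".join(marked)


print(mark_every("cento e cinqüenta e oito vírgula oito por cento", 3))
# cento e cinqüenta / e oito vírgula / oito por cento

print(mark_every("158,8%", 3))
# 158,8%
```

Written in full, the expression offers a place for a mark after “cinqüenta”; written as a numeral, it is a single block with nowhere to put one.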
(by Waldir
for group greggshorthand)

9 comments
  1. Gregg and Pitman both use 1.4 syllables per word, although Gregg started with "a word is a word". I've seen 1.4 in other systems, too.

    Scientific and legal texts would probably have more syllables per word than general language. Some of those words would require every syllable. ("Nitrite" and "nitrate" are different chemicals.) Some of those words have omit-able syllables and are easier to write than two single-syllable words.

    Then you have routine phrases, where an experienced legal reporter can represent five or ten words with one or two shorthand characters, but an equally experienced political reporter would have to write more characters. Some competitions offered separate prizes for legal and literary passages.

    Given that the perfect counting system would have to take into account the field and the related experience of the reporter, I'll accept 1.4 as good enough when recording times.

  2. You know, there is no perfect counting system. We have seen a few arguments that the ideal counting in Portuguese would use syllables instead of real words. The "standard word" seems like a third way to do the same thing; it seems to work in English, using 1.4 as the basis. But where can we find the ideal number for Portuguese? I don't know how Gregg arrived at this 1.4.

  3. Gregg in Spanish uses 2 syllables/word as the standard. Given the similarity between Spanish and Portuguese, I don't see a reason for not using the same number.

  4. Hi Macbud! Concerning the complexity of the discussion about the count of words in shorthand, let me “add fuel to the flames”…(laughs)

    2-1 Comparability of shorthand performance in different languages
    Evaluating a performance in shorthand is not handled identically worldwide. In some countries, text quantity counted in words, in others in syllables, in others (east asiatic languages!) in signs. Transcription quality is evaluated in percentages of correct words or syllables or indicating penalty points or by awarding grades.

    Take a look at the entire article:
    http://www.intersteno.it/uploads/enews35SciCom.pdf

  5. Very interesting article. I hadn't realized how extreme the difference in information vs syllables was. I like the idea of counting "ideas", but I can see even more complications when they compare the complexity of the translation. "Periwinkle" has more syllables than "blue", and it's hard to tell whether it's an important difference.

  6. That is certainly one of the most complex problems in shorthand: the count of words. And it always provokes heated discussions. I would say it is “an unsolvable problem”. There have been some attempts to solve it, but these attempts are only palliatives; I mean, they are designed to make a difficult situation seem better without actually addressing the cause of the problem. If we take that into account, we can say that Gregg’s “standard word of 1.4 syllables” for the long-hand text is a palliative, because it doesn’t take into consideration the shorthand “brief forms” and other intricacies of each shorthand system. It “sweetens the pill”!
    In my view, the actual cause of the problem is exactly this: we are dealing with two completely different systems of writing, long-hand and shorthand, the latter being a strictly phonetic system of writing.
    When only the long-hand system (the text) is taken into consideration for the count, the result is a disaster: it will never be fair, given the relation between the long-hand and the shorthand systems of writing.
    As far as I know, there have already been five attempts to solve the problem, all failed, all unfair, because they don’t account for the brief forms and the peculiarities and intricacies of each shorthand system. They usually count the brief forms as words or syllables, and frequently they are not! There are brief forms in which a single stenographic sign stands for a whole sentence!

    The five attempts (unsuccessful, in my opinion) are:
    1) Count of words.
    2) Gregg “standard word” of 1.4 syllables.
    3) Gregg in Spanish: 2 syllables/word as the standard.
    4) Count of syllables.
    5) Count of signs (East Asian languages).

    So, my opinion is the following: because of the significant differences between the two systems of writing (long-hand and shorthand), it will never be possible to create a fair criterion for the count. All we can obtain are bad “palliatives”!
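
For readers following the thread, the “standard word” arithmetic mentioned in the comments can be sketched in Python. The function names and the 280-syllable passage are hypothetical; only the divisors 1.4 (Gregg in English) and 2 (Gregg in Spanish) come from the comments above, and nothing here settles which divisor suits Portuguese.

```python
def standard_words(syllables: int, syllables_per_word: float) -> float:
    """Convert a syllable count into 'standard words'."""
    return syllables / syllables_per_word


def words_per_minute(syllables: int, syllables_per_word: float, minutes: float) -> float:
    """Dictation speed in standard words per minute."""
    return standard_words(syllables, syllables_per_word) / minutes


# A hypothetical passage of 280 syllables dictated in 2 minutes:
english_speed = words_per_minute(280, 1.4, 2)  # about 100 standard words per minute
spanish_speed = words_per_minute(280, 2.0, 2)  # 70 standard words per minute
print(round(english_speed), round(spanish_speed))
```

The same dictation is rated at different speeds depending on the divisor, which is the comparability issue the comments discuss.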
