Boy, would an electronic dictionary be useful

Is anyone working on a searchable dictionary of shorthand forms? I just tried to OCR (convert to searchable text) the 5000-word PDF on Gregg Angelfishy, but the software was giving me problems.
Does anyone know how one would go about scanning a large number of outlines into a database file or document without a ridiculous amount of busy work? What type of document or database file would best store this kind of information?
An electronic dictionary of outlines would provide a lasting benefit to the future of Gregg Shorthand. Even 5000 forms would be a big help.
Is anyone working on a searchable database of outlines? Angelfishy seems to have had success putting in searchable indexes of the brief forms, but I don’t know how much he is willing to up that number to 5000 or 17000.

(by michael_lisitsa
for everyone)

 

  1. I thought of this, as a programming exercise.

    My grand plan was,… (too much for me!)

    Database fields:

    Word in English
    Gregg spelling
    brief form (y/n)
    Link to image with outline; each image with 10-50 outlines.

    I had thought of one image per word, then you could enter text and see the images strung together as a translation, but I'm not volunteering. To look the best, all images would need to be the same height, with the baseline at the same spot in all, and all drawn at the same size.

    Phrases would work as well; enter a group of words under English spelling. The "translation" program would need to look ahead to see if the word is part of a phrase.

    I'd use a gif or png to store the images.

    MySQL could store the image or a code (e.g. A15), and the program querying the database would use the code to generate a link.

    I think ImageMagick can, with programming, take "a 30×30 square, starting at this point", so you would put "plate A Row 16 column 4" in the database. (ImageMagick can do a lot, but I don't know how to use it.) There are other image manipulation programs that can do the same thing.
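
    A minimal sketch of the "plate, row, column" idea, written in Python rather than in ImageMagick itself. The cell size, the margins and every function name here are assumptions for illustration; nothing is measured from the actual scans:

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    width: int
    height: int


# Assumed plate layout (hypothetical values): every outline sits in a
# fixed-size cell on a regular grid, with an optional margin around the grid.
CELL_W = 30    # cell width in pixels ("a 30x30 square")
CELL_H = 30    # cell height in pixels
MARGIN_X = 0   # left margin of the grid in pixels
MARGIN_Y = 0   # top margin of the grid in pixels


def cell_box(row: int, column: int) -> Box:
    """Return the pixel box for a 1-based (row, column) cell on a plate."""
    if row < 1 or column < 1:
        raise ValueError("row and column are 1-based")
    left = MARGIN_X + (column - 1) * CELL_W
    top = MARGIN_Y + (row - 1) * CELL_H
    return Box(left, top, CELL_W, CELL_H)


def geometry(box: Box) -> str:
    """Format a box as WIDTHxHEIGHT+X+Y."""
    return f"{box.width}x{box.height}+{box.left}+{box.top}"


print(geometry(cell_box(row=16, column=4)))
```

    The string it prints is in the WIDTHxHEIGHT+X+Y form that ImageMagick's -crop option accepts, so the database would only need to hold the plate name, row and column, and the crop could be worked out when the image is requested.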

    And talking about grand,… add the above 3 fields once for each edition.
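
    The fields above could be laid out roughly like this. It is only a sketch: it uses SQLite (bundled with Python) instead of MySQL so that it runs without a server, and the table name, column names and sample row are all made up for illustration. One row per word per edition stands in for "the same fields once for each edition":

```python
import sqlite3
from typing import Optional

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE outline (
    id         INTEGER PRIMARY KEY,
    english    TEXT NOT NULL,               -- word or phrase in English
    edition    TEXT NOT NULL,               -- e.g. 'Anniversary'
    gregg      TEXT,                        -- Gregg spelling
    is_brief   INTEGER NOT NULL DEFAULT 0,  -- brief form (1 = yes, 0 = no)
    image_code TEXT,                        -- e.g. 'A15', turned into a link later
    UNIQUE (english, edition)
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the table."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_outline(conn: sqlite3.Connection, english: str, edition: str,
                gregg: str, is_brief: bool, image_code: str) -> None:
    """Store one entry; the English text is lower-cased for lookups."""
    conn.execute(
        "INSERT INTO outline (english, edition, gregg, is_brief, image_code) "
        "VALUES (?, ?, ?, ?, ?)",
        (english.lower(), edition, gregg, int(is_brief), image_code),
    )


def image_url(image_code: str, base: str = "images/") -> str:
    """Turn a stored code into a link (the path layout is an assumption)."""
    return f"{base}{image_code}.png"


def lookup(conn: sqlite3.Connection, english: str, edition: str) -> Optional[str]:
    """Return the image link for a word in one edition, or None."""
    row = conn.execute(
        "SELECT image_code FROM outline WHERE english = ? AND edition = ?",
        (english.lower(), edition),
    ).fetchone()
    if row is None or row[0] is None:
        return None
    return image_url(row[0])


conn = open_db()
add_outline(conn, "Can", "Anniversary", "k", True, "A15")  # sample row only
print(lookup(conn, "can", "Anniversary"))
```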

    I better go practise writing with a pen.

    Cheers!

  2. Well, I never thought about making a converter that uses a database of outlines to convert text into outlines. That would be extremely cool, although that is obviously a second step, and if phrasing is to be properly implemented then it would take a lot of effort to get it to work.

    But that's the second step; the first step is to start a dictionary file. What we need is a utility like this:

    STEP 1- In a PDF file, I use the selection tool to enclose an outline.
    STEP 2- In response, a dialogue box pops up asking me to input the name to attach to that image.
    STEP 3- I press OK to enter this into a database, and move on to the next outline.

    With 5000 or 17000 outlines this would take a while, but not more than a night or two. Any ideas on how I could simplify the manual labor to the above steps?

  3. I like that idea; the textbooks are already scanned in at a standard size. Andrew? did a good job; they are nice and straight.

    It doesn't have to be a pdf file; it's very easy to convert from one file type to another, and it might be easier to use a different one.

    If you have a screen-computer (template? that doesn't seem the right word), you could get a list of the words and a blank area to write each word in, then press ok.

    I don't know if either of those ideas would be easier than the "plate" idea.

    Not for today.

    Today I advertised my very own website design business, but I don't have a business site. I was using that to procrastinate. Contact me if you're interested; no obligation. The price was hard to set; too low and they don't believe you know what you're doing, too high and they go elsewhere.

    Cheers!

  4. I think the trouble with automated shorthand recognition is that forms often represent more than one word. So you'd have to approach it more like speech to text instead of OCR.

    Sentence of shorthand outlines.
    example [ A K RA SH over NT ]
    Link images with possible outlines.
    [ A-K/G RA SHoverNT ]
    Compile list of possible phrases and words in sentence.
    [ ack/'I can'/'I good'/'I go' Ra (the god)/write (verb)/Ray (proper noun)/ray (noun) 'ship hand'/'short hand'/shan't ]
    Select the most statistically plausible sentence.
    [ I can write shorthand ]

    I think it would be very tricky. Speech recognition has the advantage of measuring pauses, which help to separate clauses. I think it can be done, but computers may well read minds first. It's probably simpler.
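
    The last two steps above (list the possible readings, then pick the most plausible sentence) can be sketched in a few lines of Python. Everything numeric here is invented: the counts are toy frequencies, not taken from any real corpus, and a real system would need a proper language model. The outline labels are just names for this example:

```python
from itertools import product
from math import log

# Toy data, invented for illustration: each outline maps to its possible
# readings, following the example sentence in the comment above.
READINGS = {
    "A-K": ["ack", "I can", "I good", "I go"],
    "RA": ["Ra", "write", "Ray", "ray"],
    "SH-over-NT": ["ship hand", "shorthand", "shan't"],
}

# Made-up frequency counts for single readings and for adjacent pairs.
UNIGRAM = {
    "I can": 50, "I go": 30, "I good": 1, "ack": 1,
    "write": 40, "ray": 5, "Ray": 5, "Ra": 1,
    "shorthand": 20, "ship hand": 1, "shan't": 3,
}
PAIR = {
    ("I can", "write"): 30,
    ("write", "shorthand"): 25,
    ("I go", "write"): 1,
}


def score(readings):
    """Log-score a candidate sentence: common readings and common pairs win."""
    total = sum(log(UNIGRAM.get(r, 1)) for r in readings)
    # Add-one smoothing so that unseen pairs score 0 instead of failing.
    total += sum(log(PAIR.get(p, 0) + 1) for p in zip(readings, readings[1:]))
    return total


def best_sentence(outlines):
    """Try every combination of readings and keep the highest-scoring one."""
    candidates = product(*(READINGS[o] for o in outlines))
    return " ".join(max(candidates, key=score))


print(best_sentence(["A-K", "RA", "SH-over-NT"]))  # prints: I can write shorthand
```

    Trying every combination is fine for three outlines but grows multiplicatively with sentence length, which is one reason this is closer to speech recognition than to OCR.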

  5. The software would have to have grammatical analysis to determine which word among homonyms is the right one. Is "t" "it" or "at" in this sentence? Is the "n" "in" or "not"? It would have to be very advanced for sentences like "That is not/in the cupboard", relating it to previous information to select the correct one.

    That's why in machine shorthand, we have to stroke every homonym differently, since it has to translate real-time and doesn't have the intelligence of a skilled transcriber.

  6. I have an idea for the "shorthand" half of your online dictionary. I just saw an item in the Dec. '07 Costco store magazine. It's called the Wacom Bamboo Fun Graphics tablet. It says "Simply plug it in and touch the pen tip to the tablet to touch up digital photos, draw by hand, create artwork and paintings, and even write in your own handwriting." It says it's available on costco.com. No idea how much $, though. I imagine it would be a long, involved process to prepare an online dictionary no matter what tools one used!

  7. This task is not forgotten. In fact, over today and part of yesterday I have already put 1000 of the 5000 words from the 5000 most frequent outlines book into a picture database.

    I've uploaded the files to MediaFire.
    The first one is for MS Access 2007.
    The second is for MS Access XP or 2003.

    http://www.mediafire.com/?4og0tfdr4ll
    http://www.mediafire.com/?c7l13tmb3zm

    I'd like some feedback, maybe on a better way of putting images into an Access database to make them a bit smaller. I know that a lot of the images have bits of other words that were caught in my selection, but unless I want to add an extra 20 seconds to every single data entry, I will have to live without that luxury.

    It's best that I get feedback now, when I've only done 1/5 of the database, so that if there is a substantially better way I can give up this method and move on to the other.

    Now if you want to read on, here is how I have gone about it.

    Basically what I have made is a simple Access database, with the word and its shorthand picture. The files are already a bit big (7.8 MB) because 1000 bitmaps take up a lot of room.
    I put in two forms, one a list view with the image being displayed in a box, and the other a list and preview.

    I reduced the number of clicks per data entry a lot. I still have to type out each word (frankly, OCR would be too much of a bother), but to copy the images I have downloaded 101Clip, which is a multi-clipboard. I take a snapshot of each outline with Foxit Reader (a snapshot copies any selected area of the page to the clipboard). These all get sent to the multi-clipboard, which has a feature to paste the pictures into cells going downwards. So there really isn't too much of a problem with data entry. It will probably take me another 3-4 days to finish off.

  8. Hi, in case anyone had any trouble reading the accdb files, as one emailer did, I've included an .mdb file compatible with Access 2002-2003. It's got 1800 words with pictures, but the size has ballooned to 14 MB (I do have to do something about that file size). It only contains the table, so you'll have to create a form (which really isn't that hard) so that you can see the word and the picture.

    Here's the link.
    http://4filehosting.com/file/90321/Shorthand-Anniversary-2003-mdb.html

  9. Oh, wow! That's an incredible amount of work!

    Is it possible to have it spit out individual GIF (or png or whatever else) files, one per word? If so, it would be very straight-forward for a php programmer (or perl) to make a proper online database. I'm not sure how good the Access to Other converters are, especially for images.

    (I've put "learn php" on my 5-mile-todo list; even read a book or two on it and wrote a simple program. So many wonderful things to learn, all of them distracting from each other.)

  10. http://www.mediafire.com/?efc2yy0jez2

    Here it is. It's not actually 5000 words, by the way. It's 3819, the rest being different endings like -es, -ed or -ing. Still, it is likely the largest searchable database of shorthand outlines (Anniversary only, sorry) that has been made.

    I'm gonna be on the lookout for ways of making the Access file into a user-friendly program, which requires nothing but a few files and an exe, but I'm not too sure where to start. Maybe someone on this forum can help.

    Now you can try using it: create a form, and it actually works pretty nicely.

    Michael.

  11. If you have a smartphone/pda (I have an imate ppc phone), then you'll appreciate having a dictionary on your phone. I looked around and found the best way to view and search the pdf on your phone.

    First you need Repligo Viewer (palm, ppc, symbian etc)
    http://www.cerience.com/viewers/pc.php

    Then you need this .rgo file.
    http://www.mediafire.com/?2ebuxaviogt

    The .rgo file is exactly the same as a pdf, but optimized for searching and viewing on a small screen. It can only be read by Repligo Viewer, but since this is free and one of the fastest around, it's really not a problem. The program takes around 3 seconds to search through the whole document, and loads really fast. If your device has 320×240 resolution (which is standard on nearly all pdas), then 100 times zoom will fit perfectly on your screen.

    Advise me on how you go.
    Michael.

  12. Yeah, especially when I first started entering data, I wasn't too sure of my system, so I made a lot of mistakes. I went through the first 200 entries, which is where most of the mistakes were, and fixed them up. I also widened the shorthand columns on the pdf and the pda version. So here are all the new files.

    pdf 3-column dictionary
    http://www.mediafire.com/?9dkfyqyfcd1

    database fixed
    http://www.mediafire.com/?8tzlzricrff

    rgo file for smartphone/pda
    http://www.mediafire.com/?aymrhitdrde

    There might be some entries which are still off, but all in all it's a pretty accurate database.

  13. thx – this searchable format is good for me as I have just started learning shorthand this week – doing it the way I expect most do, starting with the 1955 2nd edition Simplified and planning to go back through the 1929 Anniversary online version to top it up as seems appropriate – actually I did go through the online Anni version over the weekend while waiting for the paper book ( can carry it around and review ) and had expected few differences except the number of short forms, some phrasing and the sequence of instruction and practice – I, of course, noticed the marks for long vowels that disappeared ( but nobody is stopping me from using them to clarify words not in sufficient context ) and that changing the direction of a vowel to signify an R is missing – this last I am surprised by, this rule struck me at the time as simple and easy to use and remember and in my ( limited ) experience I did not see what the problem was with it ?!   So – what is the problem and what other 'surprising' differences am I not expecting between the 1929 and 1955 Gregg's ?

  14. I'd recommend avoiding trying to integrate Anniversary principles until you've completed the Simplified manual, unless you want to just start in Anniversary to begin with. That might even be preferable, if you're going to switch over anyway, so you can solidify the ideas from the start and not be reading briefs/phrases/outlines in the Simplified manual that conflict with the Anni equivalents, or trying to break old Simplified habits later. The Functional Method books by Leslie are excellent, if you choose to go that route.

    Some other differences between the two (I switched after a few years of using Simplified when I decided I wanted the speed Anni can offer):

    * lots and lots of word endings/beginnings (-mity, -nity, -flect, -spire, recl-, magna- etc.)
    * TR principle: in a word beginning that has "tr" after it, such as "instr-", "contr-", the "tr" and vowel after it are omitted, with the first letter(s) being written above the rest of the word. "instruction" is "n-s" above "ksh"
    * many phrasing principles, such as "sure" being "sh" (I am sure = a-m-sh), "do not" being a "tn" blend
    * many, many briefs and "special forms" which are just more specific briefs

    And there are even more principles/briefs/phrases for court-reporter types.

    I personally recommend Anniversary over Simplified, if you have the time and determination. The reading material is *so much* more interesting and numerous. All of Simplified's material is dry, dry business dictation. Anniversary has literature, court proceedings, and the Functional books have famous speeches and much more varied writing materials.

    Good luck, and remember to do a little every day 🙂

  15. Point well taken, NiftyBoy. I had two years of Simplified in high school but at the conclusion of the second year decided to adopt Anniversary. (In 1960 it was easy to obtain the Anniversary texts.) I used probably an hour a day to work my way through the Anniversary Manual and the 3rd edition of Speed studies that summer before starting college. In retrospect, I've never regretted making the transition. However, it's excellent advice to a beginner: If Anniversary is your ultimate goal, start with it so it's not necessary to "unlearn" outlines later.

    But no matter which edition of Gregg you're studying, don't rush through the Manual. If you read and practice for an hour daily and don't move to the next lesson before you're comfortable with what you've absorbed, you'll feel comfortable with the system and automatically write outlines correctly.

    Build your speed with material you've read and written from the plates at least once. (This is why Leslie's 2-volume Functional Method Manual is so good: the key gives you lots of pre-counted dictation, and after taking it you can compare your notes with the shorthand in the book.)

    You can't become a reporter in a matter of a few weeks. With daily practice a DJS writer would, I assume, be able to thoroughly familiarize himself with Anniversary within 3 months. However someone new to shorthand may require at least 6 months (and perhaps 9) to master all the theory in the Manual. (If using the 1929 Manual it's vitally important to also use the Speed Studies as well; if using the Functional Method Manual, it's not necessary but you may want to use the Speed Studies for immediate review after completing the two volumes.)

    To maintain your shorthand skill, I recommend either Functional Dictation or Gregg Speed Building for Colleges. If you aspire to be a verbatim reporter, the Expert series are excellent and offer a wealth of interesting reading material as well as additional shortcuts.

    Use shorthand daily. If you can't write the "correct" outline, write something readable. Does it really matter if you write "disjoined K – B" or "K-T-R-E-B" for "contribute" as long as you can read what you wrote?

    The drills required to become adept with shorthand may seem tedious at first, but as you master the method you'll derive much pleasure from your practice sessions.

    Like NiftyBoy I also believe Anniversary rules!!

  16. A couple of changes and clarifications to an excellent summary.

    Many here feel it's easy enough to learn Anni after Simplified that they recommend Simplified even if Anni is the goal. It's easier to learn, less likely to stop in frustration. Me, though, I went straight to Anni.

    If you use an outline frequently, take the time to look it up, so you don't get used to doing it wrong. For regular use, though, yeah, anything that you can read later is good. (Be sure to read it not too much later and clarify if necessary!)

    That's different from drilling a passage. Never drill a passage unless you have a good sample. You will reinforce the wrong outline, and it takes 5x the repetition to unlearn something.

    The Fundamental Drills on Andrew's site are a good supplement to the Anni manual, as are the Graded Readings on archive.org. I don't have Speed Studies or Functional so I can't compare.

    If reprinting or copying, try to get close to the right size, when learning. I tried using a 3/4 copy and it was hard to think and then write it at a larger size; it also blurred some necessary details of some of the outlines. It's worth the extra $6.

    The 5000 most common words for Anni on Andrew's site is excellent (and a lot of fun; some of those words aren't very common anymore). They got the order of the principles right; you really can write most things "well enough" after only a few chapters.

    Cheers!
    Cricket

  17. CricketBeautiful-1:
    > Oh, wow! That's an incredible amount of work!

    I'll say! Cheers!

    > Is it possible to have it spit out individual GIF (or png or whatever
    > else) files, one per word? If so, it would be very straight-forward
    > for a php programmer (or perl) to make a proper online database. I'm
    > not sure how good the Access to Other converters are, especially for
    > images.
    >
    > (I've put "learn php" on my 5-mile-todo list; even read a book or two
    > on it and wrote a simple program. So many wonderful things to learn,
    > all of them distracting from each other.)

    The program is already written! It's in a dozen lines of sed! It even
    converts unlimited amounts of plain text input into Gregg! For online
    use you'd just do a php system call! Ack!! I don't even usually *use*
    exclamation marks!

    After over a year of silence I finally killed the Yahoo group for that
    project. And I don't have time for it anymore. But you're right; it's
    very simple to do. The hack-up proof of concept I had working just took
    a dictionary directory tree full of outline gifs that were sized and
    cropped with ImageMagick, and then did a straight substitution on input
    from word to html image tag, with phrases parsed first.
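
    For anyone curious, the substitution described above might look something like this. To be clear, this is not the original sed script, which is not shown here; it is a fresh sketch in Python, and the dictionary entries and file paths are hypothetical. The one idea carried over is that phrases are matched before single words:

```python
import html
import re

# Hypothetical dictionary: lower-case word or phrase -> path of its outline
# image. A real one would be built by walking the directory tree of gifs.
OUTLINES = {
    "i can": "dict/phrases/i_can.gif",
    "i": "dict/i/i.gif",
    "can": "dict/c/can.gif",
    "write": "dict/w/write.gif",
    "shorthand": "dict/s/shorthand.gif",
}

# Longest phrase in the dictionary, counted in words.
MAX_PHRASE_WORDS = max(len(key.split()) for key in OUTLINES)


def to_html(text: str) -> str:
    """Replace words and phrases with <img> tags, longest phrase first."""
    # Punctuation is simply dropped in this sketch.
    words = re.findall(r"[a-z']+", text.lower())
    out = []
    i = 0
    while i < len(words):
        # Try the longest phrase first so "i can" wins over "i" + "can".
        for n in range(min(MAX_PHRASE_WORDS, len(words) - i), 0, -1):
            key = " ".join(words[i:i + n])
            if key in OUTLINES:
                src = html.escape(OUTLINES[key])
                out.append(f'<img src="{src}" alt="{html.escape(key)}">')
                i += n
                break
        else:
            # No outline on file: fall back to the plain word.
            out.append(html.escape(words[i]))
            i += 1
    return " ".join(out)


print(to_html("I can write Gregg"))
```

    Words without an outline are passed through as plain text, so the output shows at a glance which entries the dictionary is still missing.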

    Good luck!
