Wednesday, November 13, 2019

Reclaim the Records Introduces the Online New York City GEOGRAPHIC Birth Index

The following is a quote from the latest Reclaim the Records newsletter:
INTRODUCING THE NEW YORK CITY GEOGRAPHIC BIRTH INDEX
A new tool to find people born in New York City in the late nineteenth and early twentieth centuries, especially if their birth records had spelling variants or poor handwriting
Hello again from Reclaim The Records! We’re that little non-profit activist group of genealogists, historians, teachers, journalists, open government advocates, and other troublemakers who fight for the release of historical and genealogical materials from government agencies, archives, and libraries.
Today, we’d like to tell you about some new historical records that we’ve acquired, which we’ve put online for free public use, for the first time anywhere! They’ve never been available outside of New York City before.
Introducing the New York City GEOGRAPHIC Birth Index! This record set is an index to all births in New York City from roughly 1880-1912 (or 1917-ish in some cases outside of Manhattan). But unlike a typical birth index arranged by surname or by date, this one is arranged by the child’s place of birth, the actual exact street address! Hence the term Geographic.
We think there’s about 2.8 million names in here, maybe more.

You can read more in the newsletter’s web site at: https://mailchi.mp/reclaimtherecords/introducing-the-new-york-city-geographic-birth-index.

Wednesday, November 6, 2019

Don’t Want to Lose (Parts of) Your Genealogical Data?

The following is an article written by guest author Bob Coret and is copyright by him. The article is published here with the permission of Bob Coret:
Don’t want to lose (parts of) your genealogical data?
A recent research report by Genealogy Online shows that genealogists have a high risk of losing (parts of) their genealogical data when transferring a GEDCOM file from their family tree program or service to another family tree program or service. This is caused by the fact that most family tree programs and services do not follow the GEDCOM specification to the letter and because a lot of undocumented “user-defined tags” are used.
Recently, Nigel Munro Parker made his GEDCOM validator GED-inline [http://ged-inline.elasticbeanstalk.com/validate] available for re-use. GED-inline reads a GEDCOM file and checks if the file follows the rules of the specified GEDCOM specification. You get a report nearly instantly (and free). Besides statistics, it shows the number of warnings and user-defined tags, as well as a list of all warnings. Genealogy Online (a service for easily publishing your family tree online) recently deployed the open-sourced GED-inline in its infrastructure. Genealogy Online [https://www.genealogieonline.nl/en/] now checks all GEDCOM files it receives to publish online. When there are warnings regarding the GEDCOM file, Genealogy Online notifies the user.
In order not to lose genealogical information when it is transferred from “A” to “B”, agreements on how the information is recorded are of great importance. If both “A” and “B” adhere to these agreements, then the information will come across properly – without loss of information! Agreements about the format of genealogical information are laid down in the GEDCOM specification. The most recent GEDCOM version is 5.5.5, which is published on http://www.gedcom.org [https://www.gedcom.org/].
As a genealogist you do not have to dive into these GEDCOM specifications. The specifications are intended for the suppliers of family tree programs and services (more specifically, their developers). But as a genealogist you should make sure that the GEDCOM function of your family tree program or service adheres to the GEDCOM specifications! After all, if a family tree program or service does not adhere to the GEDCOM specifications, then there is a risk of information loss during the transport of the genealogical information!
As a genealogist you can check the quality of your GEDCOM too! If you’re not using Genealogy Online, just go to GED-inline [http://ged-inline.elasticbeanstalk.com/validate] directly and upload your GEDCOM. See how many warnings are in the validation report. The number of warnings says nothing about your genealogical information; you didn’t do anything wrong. The warnings relate to compliance of the GEDCOM file with the GEDCOM specification. If there are warnings, there is a good chance that the GEDCOM file will not be fully understood by another family tree program or service and that there is a risk of information loss!
Another number that you should pay attention to in the GED-inline report is the User-defined value. This number represents the number of lines in the GEDCOM file where a so-called user-defined tag is used. Such tags are valid within GEDCOM, but their meaning is not laid down in the GEDCOM specification. And often, these user-defined tags are not documented publicly. So if program “A” places certain information in a user-defined tag, chances are that program “B” does not know what the information is or what it should do with it. In a best-case scenario these values are included as a comment; in the worst-case scenario, these values are ignored. So, the user-defined tags also increase the risk of information loss.
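As a rough illustration of what the User-defined value counts, here is a minimal Python sketch. It relies on the GEDCOM convention that user-defined tags begin with an underscore. The function name `count_user_defined_tags` and the sample records are made up for this example, and GED-inline's own counting may well differ in detail.

```python
import re
from collections import Counter

# A GEDCOM line looks like "LEVEL [@XREF@] TAG [VALUE]".
# The optional cross-reference identifier sits between the level and the tag.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:@[^@]+@\s+)?(\S+)")


def count_user_defined_tags(lines):
    """Return (number of non-empty lines, Counter of user-defined tags)."""
    total = 0
    user_tags = Counter()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        total += 1
        match = LINE_RE.match(line)
        # Assumption: a tag is user-defined when it starts with "_".
        if match and match.group(2).startswith("_"):
            user_tags[match.group(2)] += 1
    return total, user_tags


# Hypothetical sample data, not taken from any real family tree program.
sample = """0 HEAD
0 @I1@ INDI
1 NAME Anna /Muster/
1 _UID 1234ABCD
1 BIRT
2 _LOC @L1@
0 @L1@ _LOC
0 TRLR"""

total, user_tags = count_user_defined_tags(sample.splitlines())
print(f"{sum(user_tags.values())} of {total} lines use user-defined tags")
print(dict(user_tags))
```

Running a count like this on your own export gives a quick idea of which user-defined tags your program writes, which is useful information to include when you contact the vendor.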
Genealogy Online’s ‘GED-inline validation statistics’ [https://www.genealogieonline.nl/en/GED-inline/] report shows that 1,215,130,449 lines of GEDCOM were inspected, 8,129,466 warnings were given (that’s 0.7%), and 93,365,260 lines contained user-defined tags (that’s 7.7%). With these shocking numbers, you have to wonder, just how much genealogical data is lost when transferred?
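The percentages quoted above follow directly from the three counts in the report. A short Python check, using only the numbers given in the article (the variable names are chosen for this example):

```python
# Figures as quoted from the GED-inline validation statistics report.
lines_inspected = 1_215_130_449
warnings = 8_129_466
user_defined_lines = 93_365_260

# Share of all inspected lines, expressed as a percentage.
warning_share = 100 * warnings / lines_inspected
user_defined_share = 100 * user_defined_lines / lines_inspected

print(f"warnings: {warning_share:.1f}%")
print(f"user-defined tags: {user_defined_share:.1f}%")
```

Both values round to the figures in the article: roughly one line in 150 triggers a warning, and roughly one line in 13 carries a user-defined tag.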
What can you, as a genealogist, do to reduce the risk of information loss?
If you – after checking the quality of your GEDCOM file – find that there is a risk of information loss, contact the supplier of your family tree program or service. Ask them to improve GEDCOM support (and minimize the use of user-defined tags and document them), so that parts of your genealogical data are not lost during export (and import)!
In your contact with the vendor you can send the GED-inline report of the validation of your GEDCOM file and the link to www.gedcom.org where the GEDCOM specifications are published. If the supplier does not consider the quality of the GEDCOM export (your genealogical data!) as important, it may be time to look for another family tree program or service.

8 Comments

“The most recent GEDCOM version is 5.5.5, …”
Information from FamilySearch (in reply to a question about 5.5.5):
“The Church of Jesus Christ has the copyright on the Gedcom Specification since 1987. There has not been a legal transfer of the rights we have to the Gedcom Specification.”
So 5.5.5 is not a legal GEDCOM version.
    —> So 5.5.5 is not a legal GEDCOM version.
    I am not a lawyer but I don’t believe having a copyright has anything to do with version numbers under U.S. laws. I am not sure about other countries, however.
    In this case, Company A can create version 1.0 of anything and copyright the product. Company B can then legally create version 2.0 of the same thing. Company C can then legally create version 3.0 of the same thing.
    If either Company B or Company C then attempts to SELL their new versions, then U.S. copyright laws will be involved. But simply announcing a new and improved standard is never illegal. U.S. copyright laws only deal with the rights to copy a product and reproduce it elsewhere, not for simply suggesting improvements to something and then publishing the new improvements’ specifications.
    FamilySearch owns the copyrights for GEDCOM and probably will do so forever. However, that does not affect your right or my right or anyone else’s rights to suggest improvements.
This article says in other words “User defined tags are evil! The more lines with user defined tags your GEDCOM file has – the lower is its quality.”
But it is not as easy as it sounds.
There are some user defined tags like “_UID” that you find in nearly every GEDCOM file and that cause no problems at all.
User defined tags are a valid way intended by the GEDCOM standard to save data for which no other standard tag exists (home person, personal tasks, additional location information, …).
What should a vendor do when users ask about “disturbing” user defined tags? Leave out some of the information? No! The goal should be to write all user data to the GEDCOM file.
The better way is that a.) software should give a detailed import report of what data is ignored and b.) vendors should share information about user defined tags (as the German GEDCOM-L group does – see here: http://wiki-de.genealogy.net/GEDCOM/_Nutzerdef-Tag).
And believe it: standard tags are no guarantee against being ignored by importing software. Sometimes the importing software has fewer capabilities and the user loses data for this reason.
Regards, Dirk (www.ahnenblatt.com).
    Dirk, nearly all data can be stored in GEDCOM files without the use of user-defined tags. Just use the EVEN.TYPE or FACT.TYPE tags that are already defined.
    I have written many articles about different applications’ compliance (or lack thereof) with the GEDCOM 5.5.1 standard. I have also notified all the developers about the problems. Most of them are not interested in improving their GEDCOM compliance.
    Keith Riggle (GenealogyTools.com)
    I don’t agree.
    I’m in the same German GEDCOM-L group as Dirk. We searched for a way to export the German “Rufname”. It is not a nickname, and there is no way to represent it in any GEDCOM version. So we agreed on _RUFNAME as a new tag, and it works fine for all the developers represented in the GEDCOM-L group.
    Or take locations that are stored in a place management system. We have agreed on this (a completely new record):
    0 @@ _LOC
    1 NAME {1:M}
    2 DATE {0:1}
    2 _NAMC {0:1}
    2 ABBR {0:M}
    3 TYPE {0:1}
    2 LANG {0:1}
    2 <> {0:M}
    1 TYPE {0:M}
    2 DATE {0:1}
    2 <> {0:M}
    1 _FPOST {0:M}
    2 DATE {0:1}
    1 _POST {0:M}
    2 DATE {0:1}
    2 <> {0:M}
    1 _GOV {0:1}
    1 _FSTAE {0:1}
    1 _FCTRY {0:1}
    1 MAP {0:1}
    2 LATI {1:1}
    2 LONG {1:1}
    1 _MAIDENHEAD {0:1}
    1 EVEN [|] {0:M}
    2 <> {0:1}
    1 _LOC @@ {0:M}
    2 TYPE {1:1}
    2 DATE {0:1}
    2 <> {0:M}
    1 _DMGD {0:M}
    2 DATE {0:1}
    2 <> {0:M}
    2 TYPE {1:1}
    1 _AIDN {0:M}
    2 DATE {0:1}
    2 <> {0:M}
    2 TYPE {1:1}
    1 <> {0:M}
    1 <> {0:M}
    1 <> {0:M}
    1 <> {0:1}
    How can you manage this using only tags from any GEDCOM version?
    Greetings from Germany, Stefan.
The website destroys my posts. Stefan
Stefan, which major apps or websites outside the GEDCOM-L group are using your new record type? Family Tree Maker? RootsMagic? Family Tree Builder? The problem with user-defined tags is that other apps can and will ignore them.
You can represent any type of name, not just nickname, with the NAME.TYPE structure that is mandatory, anyway. The PERSONAL_NAME_PIECES with NAME_PIECE_NICKNAME is optional. So, for example, you could have:
n NAME
+1 TYPE RUFNAME
You can have as many name structures attached to an INDI record as you want.

Forget Paper. Forget Hard Drives. Forget CD and DVD Disks....Use...Glass

Forget Paper. Forget Hard Drives. Forget CD and DVD Disks. Forget Most Everything Else. For Long-Term preservation, Use a Piece of Glass.

Genealogists frequently discuss the best ways to preserve family tree information so that it can be read and perhaps updated by future generations. Some people plan to save everything on paper so that “it won’t become obsolete.” Of course, they forget that paper is probably the most fragile storage medium of all, easily destroyed by water, humidity, acids in the paper, fire, insects, and a variety of other dangers.
Probably the greatest threat to data stored on paper is simply fading ink. Most documents printed with today’s paper and today’s inks will be unreadable within a century, perhaps much sooner.
Floppy disks were the storage medium of choice some years ago but have since fallen into disfavor. The magnetic information on floppy disks doesn’t last forever. Even worse, floppy disk drives are rapidly disappearing. Most of us doubt that there will be any floppy disk drives available to read the disks within the next decade or two.
A better(?) solution is to record the information on CD-ROM or DVD-ROM disks but that has similar problems. These plastic disks also do not last forever, especially those that are recorded individually on today’s computers.

(CD and DVD disks manufactured in factories do preserve the information for many more years than those made individually on a home computer. You can read my earlier articles at https://blog.eogn.com/2016/05/24/your-cd-collection-is-dying/ and at https://blog.eogn.com/2017/07/31/the-demise-of-cds-and-dvds/ for more information.)
Several newer technologies hold a lot of promise but are not yet in widespread use. One that looks especially promising is a new storage medium optimized for what industry insiders like to call cold data — the type of data you likely won’t need to access for months, years, or even decades. It’s data that doesn’t need to sit on a server, ready to be used 24/7, but that is kept in a vault, away from anything that could corrupt it.
The new technology is called “Project Silica.”

A piece of silica glass measuring 7.5 centimeters (3 inches) by 7.5 centimeters (3 inches) by 2 millimeters (0.08 inches) can store at least 75.6 gigabytes of data, photographs, music, or even high-resolution videos.
The movie industry has many thousands of films that need preservation but, like genealogists, it keeps bumping up against the limitations of today’s storage methods. For instance, the Warner Brothers studio has for decades been safekeeping original celluloid film reels dating back to the 1920s, audio from 1940s radio shows, and much more. Think about classics like “Casablanca,” “The Wizard of Oz” or “Looney Tunes” cartoons: how can they be preserved?
Together, Warner Brothers and Microsoft have developed a solution to preserve those original assets in perpetuity. The new technology is first being used to store a copy of the 1978 movie “Superman” on a small glass disc about the size of a coaster. If successful, the same technology should be useful for storing family history information as well as for thousands of other uses.
You can read more about this technology in an article by Janko Roettgers in the Variety web site at: https://variety.com/2019/film/news/project-silica-superman-warner-bros-microsoft-1203390459/.
Of course, two present limitations might remain even in the future:
1. Will any devices capable of reading “Project Silica” glass still be available a few thousand years from now?
2. Will anyone a few thousand years from now have any interest in a very old “Superman” movie or even Looney Tunes?
My thanks to newsletter reader Pierre Clouthier for telling me about this latest technology.

4 Comments

If one is just printing black on paper and uses pigment ink, it will not fade. It will outlast the paper.
    Would you provide more information re: printing with pigment ink. Is this ink available for use in home printers?
    Thank you.
And when the glass breaks?
—> And when the glass breaks?
Exactly the same thing as happens when a hard drive fails or a magnetic disk loses magnetism or a piece of paper is damaged or destroyed by any number of problems: it becomes useless.
That is why I have written many times that you should ALWAYS create two (or preferably more than two) copies of everything that is important to you and then store them in two (or preferably more than two) locations. In fact, I store my important files in three or four locations and I wouldn’t be surprised if some people store things in ten or more locations. Those widely-separated multiple copies won’t all go bad at once if you have a good backup plan.
Regardless of the storage media used, no manager of a significant data center ever depends upon only one copy of anything that is important. Individual consumers can learn a lot from data center managers.
L.O.C.K.S.S. – “Lots of Copies Keeps Stuff Safe”
See https://duckduckgo.com/?ratb=c&q=site%3Aeogn.com+%22L.O.C.K.S.S.%22&t=brave&ia=web for a list of my past articles that mention the need for L.O.C.K.S.S. – “Lots of Copies Keeps Stuff Safe”.
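For readers who like to automate the "lots of copies" idea, here is a minimal Python sketch that copies one file into several folders and then confirms that each copy is identical to the original by comparing SHA-256 checksums. The function names, the file name `family.ged`, and the temporary folders are placeholders for this example. In practice the destinations would be separate drives, computers, or cloud-synced folders rather than folders on one disk.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destinations):
    """Copy source into each destination folder.

    Returns a dict mapping each copy's path to True when its checksum
    matches the original, and False otherwise.
    """
    source = Path(source)
    expected = sha256_of(source)
    results = {}
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        results[str(target)] = sha256_of(target) == expected
    return results


# Demonstration with throwaway folders standing in for real backup locations.
with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    original = root / "family.ged"
    original.write_text("0 HEAD\n0 TRLR\n", encoding="utf-8")
    report = copy_and_verify(
        original, [root / "copy_a", root / "copy_b", root / "copy_c"]
    )
    for copy_path, verified in report.items():
        print(copy_path, "OK" if verified else "MISMATCH")
```

Re-running only the checksum comparison from time to time is a simple way to notice when one of the copies has gone bad, while the other copies are still good.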