Paradigms of Information Retrieval


What a fancy title for a simple concept.

I think of information as organizable in three ways, ordered by increasing flexibility:

  • Folders
  • Tags
  • Search

These methods can and do overlap.

Folders

Information is stored in a hierarchy and is inherently ordered.

This is good because you get efficient iteration and a clear hierarchy.

This is great until it isn’t. Not everything is clearly orderable. Sometimes there is more than one natural order.

Folders are analogous to arrays in programming languages: every item lives at exactly one position, and nested folders behave like nested arrays.
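Here is a rough sketch of that idea in Python. The folder layout and the walk function are made up for illustration; a folder is assumed to be a (name, children) tuple whose children are either file names or more folders.

```python
def walk(folder, prefix=""):
    """Yield every file path in order, depth first."""
    name, children = folder
    path = f"{prefix}/{name}"
    for child in children:
        if isinstance(child, tuple):
            # A nested folder: recurse, keeping the children in their stored order.
            yield from walk(child, path)
        else:
            # A plain file name.
            yield f"{path}/{child}"


# Hypothetical example data, not taken from any real system.
notes = ("notes", [
    ("recipes", ["bread.txt", "soup.txt"]),
    ("work", ["todo.txt"]),
    "inbox.txt",
])

paths = list(walk(notes))
for p in paths:
    print(p)
```

Iteration is cheap and the order is fixed, but each file gets exactly one path. If bread.txt also belongs under "work", this structure has no good place for it.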

Tags

Information is tagged with keywords, or labeled through some other mapping.

Tags are analogous to maps in programming (a.k.a. hash tables, dictionaries, hashmaps, etc.): each tag maps to the items that carry it, so tags carry the pros and cons of maps.

Tags are, above all, flexible. You can come up with any sort of mapping scheme, and even implement some sort of ordering (for example, by tagging each piece of info with an integer, which gives you an array).

Their disadvantages are specific to the mapping scheme used, but in a technical sense, they’re at least as capable as folders, since folders are isomorphic to a specific tagging scheme (tag each item with the folders that contain it).
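A minimal sketch of both claims, assuming items are just names with a set of tags attached. The names build_tag_index and folder_tags are hypothetical helpers, and the data is invented.

```python
from collections import defaultdict


def build_tag_index(items):
    """Map each tag to the set of item names carrying it."""
    index = defaultdict(set)
    for name, tags in items.items():
        for tag in tags:
            index[tag].add(name)
    return index


def folder_tags(path):
    """Turn a folder path into one tag per ancestor folder."""
    parts = path.strip("/").split("/")[:-1]  # drop the file name itself
    return {"/".join(parts[:i + 1]) for i in range(len(parts))}


# Hypothetical example data.
items = {
    "bread.txt": {"recipe", "baking"},
    "soup.txt": {"recipe"},
    "todo.txt": {"work"},
}

index = build_tag_index(items)
recipes = sorted(index["recipe"])

# A folder path expressed as tags: one tag for each enclosing folder.
bread_tags = folder_tags("/notes/recipes/bread.txt")
```

An item can carry as many tags as you like, which is exactly what a single folder path cannot do. Going the other way, folder_tags shows one way a folder hierarchy can be written down as a tagging scheme.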

Search

This method is only possible because modern computers are really fast. Rather than organizing information, you just search for what you need.

You can impose structure on your data (such as tags and folders), and either use or ignore it when searching.

There’s not much to say here. We’ve all used Google, and the programmers among us have tried grep.

Search is analogous to a really fast computer that makes a lot (but not all) of data structure optimization pointless.

When the data gets big, you need to start imposing some structure to efficiently find anything.
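To make the trade-off concrete, here is a small sketch comparing a grep-style scan with an inverted index, which is one common way to impose structure for search. The function names and documents are invented for illustration, and the tokenizing is deliberately naive (lowercase, split on whitespace).

```python
from collections import defaultdict


def scan(docs, word):
    """Grep-style search: read every document, no structure needed."""
    return sorted(
        name for name, text in docs.items()
        if word in text.lower().split()
    )


def build_inverted_index(docs):
    """Precompute a map from each word to the documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in text.lower().split():
            index[token].add(name)
    return index


# Hypothetical example data.
docs = {
    "bread.txt": "Flour water salt yeast",
    "soup.txt": "Water onion salt",
    "todo.txt": "Buy flour",
}

index = build_inverted_index(docs)
by_scan = scan(docs, "flour")
by_index = sorted(index["flour"])
```

Both approaches return the same documents. The scan touches every document on every query, which is fine for a handful of files. The index costs time and memory up front, but each lookup afterwards is a single dictionary access.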

Errors

If you find any errors, or if I’m wrong, or you want to tell me something, feel free to tell me.
