Everything is a Subject

Wednesday 26 January 2011

Emnekart 2011

The Norwegian Topic Maps conference, Emnekart 2011, takes place on February 10.

This year it is part of Software 2011, the main conference of the Norwegian Computer Society.

Here are a few highlights from the English part of the program:

Best Practices for Publishing Linked Open Data
Graham Moore, NetworkedPlanet

Many organizations and government initiatives are advocating and providing support for the publishing of 'Data on the Web'. This is a paradigm shift from publishing HTML pages for people to read, to publishing raw data that can be processed and interpreted by machines and people.

While it may seem a trivial activity to put data on the web, the fact is that many organizations struggle to do this in a way which is useful for their community.

The talk discusses and demonstrates the different quality levels of publishing linked data on the web, ranging from simple files to complete Linked Data endpoints.

Topic Maps applications for Android and iPhone

Jan Schreiber, Ravn Webveveriet

Jan has written tmjs, a Topic Maps engine in JavaScript. He will talk about how JavaScript and HTML5 make platform-independent development of semantic applications possible. He has some pretty exciting applications to show, too.

He will show how to visualize geotagged Topic Maps with maps and in an augmented reality browser.

Augmented reality means that you look through your mobile camera and get an annotated view of the world around you on the screen.

Have you ever considered how to present a topic map on a limited mobile screen?

- Jan has.

Semantic Repowering the Afghanistan War Diary

Benjamin Bock and Thomas Efer, Topic Maps Lab - University of Leipzig

In summer 2010 WikiLeaks released one of the largest classified military leaks in history: the Afghanistan War Diary. In the news it was labeled a report, but for engineers it was merely a big and messy CSV file with largely unnormalized data. Publishers like Spiegel Online spent several months before the release manually analyzing the data. With semantic tools this could have been much easier…

Visualizing Topic Maps with Silverlight PivotViewer

I also look forward to the lightning talk where Graham Moore gives a demo of how to visualize Topic Maps with the Silverlight PivotViewer.

Friday 5 November 2010

Semantic ambiguity and how human communication fails - except by accident

It's been a busy week on the topicmapmail list, where a thread of more than 50 messages developed, starting off as an announcement of the Afghanistan War Diary as a topic map in Maiana (made from WikiLeaks data).

The discussion went off in several directions, and spun into a discussion of typing and the inherent messiness of trying to model the world.

We're kind of back at the start: in a messy, semi-structured world with information overflow, what kind of technology can help us find a way?

Steve Pepper:

The categories of human knowledge are better expressed using a prototype model than a criterial-attribute (or "Aristotelian") model. In such a model, roles and types are not sharply differentiated, but rather exist on a continuum.

Alexander Johannesen questioned the basis of linked data and subject identification:

So the question becomes; can we still rely on our TM way of subject identification? I'm not so sure. Things change. And here's the catch; the more you describe that thing, the more you try to pin it down its definition, the less likely it is for that thing to fit whatever thing you need in what you're modelling. And the less likely it is that that model truly represents reality, so there's a whole scale of inherit dis-ambiguity that you need to have in mind when you knowingly have to make a million compromises while modelling.

To what degree do we need things to be correct vs. useful? And, in the end, is it useful that things aren't correct?

Andrew S. Townley had thoughts on scalability and how this can work outside a controlled environment:

Again, once you start trying to correlate statements about things made by millions of people each with thousands of overlapping but inconsistent assumptions, this stuff matters. In a controlled environment or walled garden, you have a lot more leeway with "useful", but I don't think that's good enough in today's world with over 1 billion addressable pages added to the Web every day.

I think it's important to (continue to) talk about these issues now while there's still a chance of influencing how people try to deal with a world with that much data. The less retrofitting and rectifying that needs to be done, the easier it will make things for everyone. Most of the people churning out all that content have no idea these problems exist. After all, they have Google and the magic search box. All they need is just a little bit more link juice and social proof... ;)

Patrick Durusau followed up with a blog post about semantic ambiguity:

Since we are trying to communicate with other people, there isn’t any escape from semantic ambiguity. Ever.

It all led me back to Wiio's laws, which I have revisited many times before. So here's some Friday edutainment for those of you who haven't read Wiio's laws on

How all human communication fails, except by accident:

Communication usually fails, except by accident.
If a message can be interpreted in several ways, it will be interpreted in a manner that maximizes the damage
There is always someone who knows better than you what you meant with your message
The more we communicate, the worse communication succeeds

I see the laws both as a serious warning about how messy human communication is, and as black humor for people trying to do this for a living.

Ironically, most of Wiio's work, and most of the information about it, is in Finnish, which I think most people on this planet don't understand very well...

Jukka Korpela has, however, written an excellent commentary on Wiio's laws:

Professor Osmo A. Wiio is a Finnish researcher of human communication. He studied, among other things, readability of texts, organizations and communication within them, and the general theory of communication.

In addition to his academic career, he has authored books, articles, and radio and TV programs on technology, the future, society, and politics. He formulated "Wiio's laws" when he was a member of the Finnish parliament.

Monday 13 September 2010

Food traceability system using RFID and Topic Maps

My Google alert found me an article about a food traceability system combining RFID and Topic Maps.

The system is for Spanish ham from the Teruel province (which is supposed to be excellent, and has a Denomination of Origin status):

Free Traceability Management Using RFID and Topic Maps

The article is from ECIME 2010 (the 4th European Conference on Information Management and Evaluation), but I have not found any information besides the conference program and the abstract.

According to the conference website: "The proceedings of the above conference are now available to purchase in CD-ROM format only".

However interesting it may be, I'm not that keen on spending £50 to get a CD-ROM in the mail.

Open Access publishing is the way, that's for sure...

Friday 27 March 2009

Wikipedia as a good PSI-source?

I stumbled upon an interesting discussion in the blog post Wikipedia - A Democratic Gold Standard for Topic Maps, where Vegard Sandvold suggests that the Topic Maps community "should adopt Wikipedia as it’s democratic and user-generated repository of topic PSI’s". (Lars Marius Garshol wrote a good blog post about the general idea behind PSIs.)

Steve Pepper disagrees, and argues that ideally the PSD (Published Subject Descriptor) should include the minimum of information needed to unambiguously identify the subject.

Robert Engels then enters the discussion and argues from the RDF point of view.

My view is that I would currently use Wikipedia, because on some subjects it's the best source I've got. I agree with Steve Pepper, but imagine that it could be useful in some contexts to be a bit fuzzy on purpose. A widely defined and slightly fuzzy subject might be exactly what we want, to be able to "start a conversation".
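To make the idea concrete, here is a minimal sketch of what using a Wikipedia URL as a subject identifier buys you: two topics created independently end up merged because they point at the same identifier. The `Topic` class and `merge_by_identifier` function are hypothetical names made up for this illustration, not part of any Topic Maps engine, and the URLs are only examples (the `example.org` one is invented).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy topic: names plus the subject identifiers that pin down its subject."""
    names: set[str] = field(default_factory=set)
    subject_identifiers: set[str] = field(default_factory=set)


def merge_by_identifier(topics: list[Topic]) -> list[Topic]:
    """Merge topics that share at least one subject identifier (hypothetical helper)."""
    merged: list[Topic] = []
    for topic in topics:
        # Start from a copy so the input topics are left untouched.
        combined = Topic(set(topic.names), set(topic.subject_identifiers))
        keep: list[Topic] = []
        for existing in merged:
            if existing.subject_identifiers & combined.subject_identifiers:
                # Same subject: fold names and identifiers together.
                combined.names |= existing.names
                combined.subject_identifiers |= existing.subject_identifiers
            else:
                keep.append(existing)
        merged = keep + [combined]
    return merged


# Illustrative identifiers only; the example.org PSI is made up.
oslo_psi = "http://en.wikipedia.org/wiki/Oslo"
a = Topic({"Oslo"}, {oslo_psi})
b = Topic({"Christiania"}, {oslo_psi, "http://example.org/psi/oslo"})
c = Topic({"Bergen"}, {"http://en.wikipedia.org/wiki/Bergen"})

result = merge_by_identifier([a, b, c])
for t in result:
    print(sorted(t.names), sorted(t.subject_identifiers))
```

The sketch says nothing about how good the identifier is. A fuzzy Wikipedia article merges just as eagerly as a carefully written PSD, which is exactly the trade-off discussed above.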

Friday 13 March 2009

Exploring Semantic Mashups in the Wandora Workshop at Topic Maps Norway 2009

I really look forward to the Wandora workshop at Topic Maps Norway 2009 / Emnekart 2009 on March 18, as I have wanted for some time to play a bit with Wandora.

Wandora is an Open Source Java application made mainly for building and managing topic maps, but I think of it as a more general semantic toolbox, and think that exploring Wandora as a semantic extraction tool will be fun.

Wandora has a graphical user interface and several data storage options. Wandora both reads and exports the Topic Maps formats LTM and XTM along with the N3 RDF-format, which should make it a very useful toolbox.

The workshop will explore Wandora as a tool for extracting information from open web sources using some of the many built-in extractors to generate topic maps. It will demonstrate how to use Wandora to do semantic mashups. This is a hands-on workshop, which I imagine should be interesting to TM developers, Semantic Web developers and developers who know web 2.0 style mash-ups.

I have a dream of one day converting my well-tagged mp3 collection to a topic map, mashing it up with open music information, and exploring the exciting new possibilities for navigation and search, which would make iTunes look rather dull.

The workshop will focus on a few of the many interesting Wandora extractors to generate and merge topic maps. The list of available Wandora extractors is impressive, and keeps growing with every new release:

MP3 ID3 metadata
JPEG metadata
PDF metadata
FreeDB (music CD metadata)
Last.fm XML feeds
Internet Movie Database datafiles
Converts and imports any SQL database to a topic map
BibTeX
Flickr
YouTube
Digg and Del.icio.us
Geonames
Wikipedia extractor and a more general MediaWiki extractor
Wordnet
OpenCalais classifier
OpenCyc extractor
RSS 2.0 and Atom feeds
Convert emails and email repositories to a topic map
Convert file system structures to a topic map
Microformat extractors:
- Convert geo microformat snippets to topic maps
- Convert hcalendar microformat snippets to topic maps
- Convert hcard microformat snippets to topic maps

A Vision for a Topic Maps World

Graham Moore is giving a presentation at Topic Maps Norway 2009 / Emnekart 2009 next week, which is not to be missed:

A Vision for a Topic Maps World

Graham Moore, NetworkedPlanet

Topic Maps has been successful in delivering value in the context of content management, intranets and web publishing. In these contexts it has provided value in terms of improved navigation and findability of content. However, the scope of these projects has been limited, and it could be argued that Topic Maps has simply created better managed, and more useful silos of content. This talk presents a vision and concept for enabling Topic Maps in a global context.

We describe how the fundamental concept of Topic Maps, the separation of identity from addressing, can be taken and utilised on a global scale. This vision includes how people who have invested in Topic Maps in the small can contribute to and benefit from this step change in the scope of Topic Maps usage.

Saturday 17 May 2008

Published Subjects and global identifiers

Dataforeningen arrangerer et møte om Published Subjects og globale identifikatorer i universitetsbiblioteket tirsdag 27. mai kl 16-18.

English translation:

The Norwegian Computer Society is planning a meeting about Published Subjects and global identifiers on Tuesday, May 27, from 16:00 to 18:00. The program is quite exciting, with four lightning talks, but the presentations are planned to be in Norwegian. (We would probably be able to reconsider this and speak English if somebody who does not understand Norwegian would like to join us.)

The meeting will be held in the electronic classroom at the University Library in Oslo.

Steve Pepper (Ontopedia) starts with a quick introduction to the need for shared global identifiers and to Published Subjects, where he also explains the terminology (PSI, PSD, ...).

Are Gulbrandsen (USIT) presents known published PSI sets and a few unresolved publication and discovery issues. He also discusses potential sources of PSIs (for instance GREP, ISBN, Wikipedia, LinkedIn and existing thesauri).

Alexander Johannesen (Bekk) continues where he left off at Topic Maps 2008, Visions for a Topic Mapped Library, and wants to discuss the use of PSIs from a library perspective. (He has also promised to give us a quick overview of what they have managed to do at the New Zealand Electronic Text Centre (NZETC), Victoria University of Wellington. NZETC got the Topic Maps Project of the Year 2008 award.)

Stian Danenbarger (Bouvet) also continues from his Topic Maps 2008 presentation, Published Subjects: Small Pieces, Meaningfully Joined, and wants to focus on how we can add context to the discovery of PSI sets: who is using a PSI set, and how is it used?

More info in Norwegian