Contextual annotation

Adding location information to captured images and video offers a nice way to browse memories. Geotagging, as the practice is often called, is picking up speed as handheld GPS devices become more common. While body area networks have not really broken through, GPS modules are slowly creeping inside cameras and camcorders themselves. In addition to location, there are a lot of other types of information that would be fun and even interesting to inspect later on.

Finding easy ways of producing and injecting descriptive data is potentially one of the great differentiating factors between camcorder manufacturers in the future. In addition to the usability factor, there are also some technical hurdles. As with geographic coordinates, support in the data container format and in the software interfaces is required for all of the applications. Happily, it is possible to embed metadata in extensible XML formats directly inside common delivery formats, so that, for example, dated viewing software would just ignore it.
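To make the "older software just ignores it" idea concrete, here is a minimal sketch in Python. It is only an illustration: the namespace URI, the element names and the helper functions are all made up for this post and do not follow any published metadata schema. The point is that the context fields live in their own XML namespace, so a reader that only knows the plain title element never has to look at them.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the extra context fields (not a real schema).
CTX_NS = "http://example.com/ns/context/1.0"


def build_metadata(title, lat, lon, tags):
    """Return an XML string with a core title plus namespaced context fields."""
    ET.register_namespace("ctx", CTX_NS)
    root = ET.Element("metadata")
    ET.SubElement(root, "title").text = title
    # Extension elements live in their own namespace.
    loc = ET.SubElement(root, f"{{{CTX_NS}}}location")
    loc.set("lat", f"{lat:.5f}")
    loc.set("lon", f"{lon:.5f}")
    for tag in tags:
        ET.SubElement(root, f"{{{CTX_NS}}}tag").text = tag
    return ET.tostring(root, encoding="unicode")


def read_title_only(xml_text):
    """Mimic a dated viewer: read the one element it knows, skip the rest."""
    root = ET.fromstring(xml_text)
    return root.findtext("title")


def read_context(xml_text):
    """Mimic a newer viewer that also understands the context namespace."""
    root = ET.fromstring(xml_text)
    loc = root.find(f"{{{CTX_NS}}}location")
    tags = [t.text for t in root.findall(f"{{{CTX_NS}}}tag")]
    return (float(loc.get("lat")), float(loc.get("lon"))), tags


if __name__ == "__main__":
    xml_text = build_metadata("Beach day", 60.16952, 24.93545, ["holiday", "family"])
    print(read_title_only(xml_text))
    print(read_context(xml_text))
```

How such a block would actually be carried inside a given video container depends on the container format and is left out here.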

Metadata creation is a real problem for amateurs and pros alike. Digitizing the most commonly used solution, production notes written on a piece of paper, tends to be tedious manual work, even if the document was not lost in the process. Obviously good planning helps, and notes can be skipped altogether or written right into the planning document on a laptop. Nevertheless, avoiding extensive logging work before, after and especially during the shoot would be a clear improvement to the production process.

The tricky part of the equation is finding innovative ways to add contextual data on the fly without disrupting shooting. Cameras simply do not have a user interface for entering textual descriptions. Even if there were keys, writing down even just the unforeseen events would either take away from the thrill of the experience being shot or seriously hamper the workflow. Predefined contextual profiles, i.e. tags for common situations, would be the obvious solution.
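As a rough sketch of what predefined contextual profiles could look like in software, the Python snippet below keeps a small table of profiles and stamps every new clip with the tags of whichever profile is currently selected. The profile names, the tags and the Recorder class are invented for illustration; a real camera would ship its own set and its own menu logic.

```python
from dataclasses import dataclass, field

# Hypothetical profiles mapping a situation to its default tags.
PROFILES = {
    "birthday": ["party", "family", "indoor"],
    "hiking": ["outdoor", "nature", "sport"],
    "concert": ["music", "crowd", "night"],
}


@dataclass
class Clip:
    filename: str
    tags: list = field(default_factory=list)


class Recorder:
    """Toy model of a camera that tags clips from the active profile."""

    def __init__(self):
        self.active_profile = None
        self.clips = []

    def select_profile(self, name):
        # One menu selection before shooting replaces typing during it.
        if name not in PROFILES:
            raise KeyError(f"unknown profile: {name}")
        self.active_profile = name

    def record(self, filename):
        # Copy the tag list so later edits to one clip do not leak into others.
        tags = list(PROFILES[self.active_profile]) if self.active_profile else []
        clip = Clip(filename, tags)
        self.clips.append(clip)
        return clip


if __name__ == "__main__":
    cam = Recorder()
    cam.select_profile("hiking")
    print(cam.record("clip_0001.mp4"))
```

The profile is chosen once, before the action starts, so nothing has to be typed while shooting.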

Selecting the right tag from the camera menus could be a feasible solution. Of course, much of the logging can nowadays be automated. Current tagging solutions are often embedded in the ingestion process. It is much harder to find cues for detecting the right context for each media item on the spot while shooting. Autotagging solutions can rely on imaging techniques such as face detection and motion analysis. Designing automated logic to solve this problem would require a good understanding of what people shoot and why they do it.

Traditional logging software tends to be dumb in the sense that it cannot make intelligent guesses about the context. Building a neural network that learns to combine input from different sources would be one solution. In this day and age it would also be feasible to import data from other networked applications. The list of alternatives is endless, limited only by creativity in adding new kinds of sensors and data stores. Interesting possibilities might also emerge if trends were analyzed later on.

The applications should focus on enhancing the viewing experience. One fun idea would be to inject a link to calendar data or micro-blogging information. Likewise, weather data is widely available and can be fetched based on date and location information. Clearly there will be some very nice mash-ups brewing once network interfaces become more prevalent in cameras. Maybe the evolution will start with the Wi-Fi function embedded in the Eye-Fi memory card.
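The weather idea boils down to a lookup keyed on time and place. The sketch below, in Python, assumes the weather observations have already been downloaded into a plain list of dictionaries; it does not call any real weather service. The field names, the 50 km and 3 hour limits and the function names are arbitrary choices made for this illustration.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_observation(clip_time, lat, lon, observations,
                        max_km=50.0, max_hours=3.0):
    """Pick the observation closest in time among those close enough in space.

    Returns None when nothing falls inside both limits.
    """
    best, best_gap = None, None
    for obs in observations:
        if haversine_km(lat, lon, obs["lat"], obs["lon"]) > max_km:
            continue
        gap = abs((obs["time"] - clip_time).total_seconds()) / 3600.0
        if gap > max_hours:
            continue
        if best_gap is None or gap < best_gap:
            best, best_gap = obs, gap
    return best


if __name__ == "__main__":
    # Made-up sample observations for illustration only.
    observations = [
        {"time": datetime(2009, 7, 4, 12, 0), "lat": 60.17, "lon": 24.94, "summary": "sunny"},
        {"time": datetime(2009, 7, 4, 18, 0), "lat": 60.17, "lon": 24.94, "summary": "rain"},
        {"time": datetime(2009, 7, 4, 12, 30), "lat": 61.50, "lon": 23.76, "summary": "cloudy"},
    ]
    match = nearest_observation(datetime(2009, 7, 4, 13, 0), 60.20, 24.90, observations)
    print(match["summary"] if match else "no weather data")
```

The matched summary could then be written into the clip's metadata alongside the coordinates, so a viewer can show it without any network access.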

Fideocam concept

Fideocam is a pioneer in automating video tools for personal experience capture.


