I’ve started work on creating some Debian/Ubuntu packages of Garlik’s 4store RDF database. Currently there is source and soon there will be some RPMs, so I thought I’d help bring .debs to the party. The guys from Garlik are very helpful and can be contacted in #4store on freenode. According to their site:
4store was designed by Steve Harris and developed at Garlik to underpin their Semantic Web applications. It has been providing the base platform for around 3 years. At times holding and running queries over databases of 15GT, supporting a Web application used by thousands of people.
I’m looking at this product to build the core of a system that will help create a scalable, repeatable, easy-to-use data store. The idea behind this is to help organisations achieve the goal of opening their data. @johnlsheridan stated that open source is the perfect system for promoting and achieving the goal of open data and open standards; all three go hand in hand. My aim is to remove the “black magic” currently involved in setting up some of these systems and help people free their data easily, which should benefit us all in the long run.
4store has some very interesting features, such as the ability to cluster database nodes for scalability. It also has a SPARQL HTTP server which provides a RESTful API. The fact that all the data can be accessed by a URI means this data is cacheable, which in turn means you can scale this system very easily with proxies and load balancers. The beauty of the system, though, is that none of these extra features are a requirement, so if you’re a small organisation with a moderate amount of data you don’t need to break the bank to do it.
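To make the "accessed by a URI" point concrete, here is a minimal Python sketch of how a query becomes a plain GET URL under the standard SPARQL protocol (the query travels in a `query` parameter). The endpoint address `http://localhost:8000/sparql/` and the helper names are my own assumptions for illustration, not something taken from the 4store docs, so adjust them to wherever your SPARQL server is actually listening.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request, urlopen

# Assumed endpoint: change host, port and path to match your own setup.
ENDPOINT = "http://localhost:8000/sparql/"


def sparql_get_url(endpoint, query):
    """Encode a SPARQL query into a GET URL (SPARQL protocol 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(endpoint, query):
    """Fetch results over HTTP. Needs a running server.

    Asking for JSON via the Accept header is an assumption; the server
    may answer in another format, in which case json.load will fail.
    """
    request = Request(
        sparql_get_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
    # The same query always gives the same URL, which is what lets a
    # caching proxy in front of the server answer repeat requests.
    print(sparql_get_url(ENDPOINT, query))
```

Because the URL is just a function of the query text, an ordinary HTTP cache or load balancer can sit in front of the store without knowing anything about RDF.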
Why free your data?
Well, data is key for communication, and the traditional business model is to guard your data and protect it. However, the internet generation is here, and they’re changing the world! As soon as you put your data in an open standard, anyone can use it. Using linked data means that someone looking for information about a certain item can find your data more easily and, more importantly, that the data can be interpreted easily by machines, which improves search algorithms. If the item searched for is a product, it may get you a sale. If the item is data about health, it might even save a life. Data is diverse and so are its uses; when you open your data, people may find a new way of using it and progress is made. Opening your data allows people to create mashups and pull in data from a multitude of sources, giving an accurate and informative view of the requested subject matter. Open source, open data and open standards, if embraced by enough people, will help everyone move forward. I’ve only skimmed the surface of what these systems can be used for, and trying to describe the importance of open data is incredibly difficult because data can be anything! If you want to know more, I’d suggest looking at these TED talks:
and an excellent example of open data in use: