Tengu: data scientists’ toolbox for easier and faster Big Data analysis

In a (big) data-driven world, a lot of pressure is put on data scientists to make sense of the massive amount of information companies have stored.

To do so, they must combine and configure the right set of tools (for data storage, processing, analysis, visualization, etc.), programming languages and models to match the company’s specific needs. While most data science platforms limit the number of compatible tools, the imec spin-off and imec.istart company Qrama developed Tengu: a generic, open-source data science hub that offers professionals complete freedom to choose the tools that best suit each of their data science projects. The platform is currently being used as part of imec’s City of Things program in the Belgian city of Antwerp.

A new, IoT-driven chapter in Data Science

By 2020, it is expected that 44 Zettabytes of data will have been created – requiring storage capacity equivalent to 20% of the size of Manhattan. Moreover, with the advent of the Internet of Things, that data is being produced and collected at a higher speed and in numerous different formats – from the traditional audiovisual and text, to novel types brought by a multitude of sensors, radars and other (smart) devices.

Making sense of all this data is essential for the success of today’s businesses. To do that, data scientists need to combine and explore data from various sources, code and build models to leverage that data, deploy those models into production and present the results in a comprehensible way, either through a report or a web application. Data science platforms offer a generic environment in which the tools needed for each of these steps can be combined easily and automatically, without manual configuration, allowing data scientists to focus more on the business intelligence and less on programming.

One Data Science solution does not fit all

The Internet of Things has also caused the number of data science tools to escalate, making it difficult for platform providers to keep up with the constant change and innovation. As a result, most providers opt to limit the number of compatible tools, usually to proprietary solutions.

However, each data science project requires a unique combination of tools for each step of the process – which isn’t always possible with current state-of-the-art platforms. With that in mind, Thomas Vanhove and Gregory Van Seghbroeck developed Tengu, a flexible platform to quickly set up, configure and manage Big Data environments.

Your Data Science blank canvas

Tengu is a digital work environment for Big Data projects. It offers a generic, flexible framework for data scientists to combine and experiment with different software, without the need for manual configuration and setup, work that usually takes up to a few months before they can actually start on the project itself. The platform offers both batch- and stream-oriented data processing, keeping up with the changes brought by the Internet of Things.

Co-founder Thomas Vanhove explains: “Traditional Data Science was based on big data sets, all being collected at once – and requiring a so-called batch analysis. With the Internet of Things, the focus has shifted from analyzing big amounts of data, to being able to process different types of data as they are produced – which is called ‘stream analysis’. Because batch analysis still covers around 80% of the market, most data science platforms are not yet optimized to support stream analysis. That is where Tengu can really make a difference.”
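The difference between the two approaches can be illustrated with a minimal Python sketch. It is not Tengu code and does not use any Tengu API; the function names and the sample sensor readings are hypothetical and chosen purely for illustration. The batch function needs the complete data set before it can start, while the stream function updates its result each time a new reading arrives.

```python
from typing import Iterable, Iterator, List


def batch_mean(readings: List[float]) -> float:
    """Batch analysis: the whole data set is collected before processing starts."""
    return sum(readings) / len(readings)


def stream_mean(readings: Iterable[float]) -> Iterator[float]:
    """Stream analysis: yield an updated mean as each new reading arrives."""
    count = 0
    mean = 0.0
    for value in readings:
        count += 1
        # Incremental update: no need to keep earlier readings in memory.
        mean += (value - mean) / count
        yield mean


# Hypothetical sensor readings, used only for illustration.
sensor_readings = [21.0, 23.0, 22.0, 26.0]

batch_result = batch_mean(sensor_readings)
stream_results = list(stream_mean(iter(sensor_readings)))
```

Once all readings have arrived, the last streamed value equals the batch result, but the stream version had a usable intermediate answer after every single reading.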

The platform is open to all kinds of Big Data processing, storage and cloud components, preventing vendor lock-in.

“Our users are free to use whichever open-source or proprietary solution they want. Moreover, after the data science work environment has been built, it can work independently from our platform,” emphasizes co-founder Gregory Van Seghbroeck.

From an academic need to a business solution

In 2013 Thomas Vanhove was a PhD student at imec - Ghent University, where he needed – as part of his research – to experiment with several data science tools. Quickly realizing the available platforms could not offer the flexibility he required for his work, he decided to develop his own data science platform.

As more and more fellow PhD students and other colleagues requested his platform for their own research, Thomas began to inquire about the market feasibility of his idea. In 2015 he joined imec’s Opportunity Recognition Workshop (ORW), a three-day workshop for researchers and PhD students to assess their work’s potential from a business and economic perspective. After being assigned a business coach to help develop their idea, Thomas Vanhove and Gregory Van Seghbroeck launched the start-up Qrama to commercialize the Tengu platform, and in May 2016 they were accepted into the imec.istart Business Incubation Program.

Being a spin-off company from imec - Ghent University, Qrama acquired a worldwide exclusive license from both entities to use the code behind the Tengu platform, which is now being used by imec in the context of the City of Things (CoT) program.

Philip Leroux, Operational Lead at the CoT Data Team of IDLAB – Ghent, highlights the value of using the start-up’s innovative technology: "The imec City of Things (CoT) project envisions to be an open and future proof large-scale smart city living and technology lab in the city of Antwerp. With Tengu in the core of the CoT R&D Data Platform, the effort to set up and maintain all required IoT data components is greatly reduced. At the same time, Tengu allows the CoT R&D Data Platform to easily scale up for future research use cases and to remain very flexible and open in order to support novel (big data) technologies and server configurations."

Network support, the hidden gem of imec.istart

Both founders praise the support of the imec.istart program to help them scale the company.

“Being able to get €50,000 at such an early stage is of great help. Thanks to this financial support, we were able to hire two full-time developers less than one year after we founded the company,” states Thomas Vanhove.

Both entrepreneurs also considered imec.istart’s workshops of great added value, as they helped them get acquainted with all the commercial, legal and financial aspects of starting and maintaining a business.

“We are both technical people, so we were completely unaware of what it entails to run a business,” acknowledges Gregory Van Seghbroeck.

However, according to the two innovators, the holy grail of imec.istart lies in its robust network of experts and partners, which offers unique opportunities as well as exclusive software and business deals. This “relieved some of the pressure from us, by being able to outsource some of our tasks to imec.istart partners,” says Thomas Vanhove.

The next steps

Having already secured two paying customers, Qrama aims to sign up ten customers in Europe by the end of 2017, before expanding into the United States in 2018. The founders have already attended several conferences in the US to get to know the market and potential leads.

“We are mostly targeting both consultancy companies and large enterprises that have dedicated data science teams,” discloses Gregory Van Seghbroeck.

Following its acceptance into an innovation project by VLAIO, the Flemish agency for innovation and entrepreneurship, the start-up currently needs two more full-time developers. To achieve its internationalization plans, successfully conduct the innovation project and further develop its platform, the company will also be looking for additional funding.

Meet the founders

Thomas Vanhove is a co-founder and CEO at Qrama. He has a Master’s Degree in Computer Science from Ghent University. In August 2012 he started his PhD at the IBCN research group (now IDLab), researching data management solutions in cloud environments. During his PhD he developed an initial version of the Tengu platform so that he could easily experiment with big data frameworks.

Gregory Van Seghbroeck is a co-founder and CTO at Qrama. He has a PhD in Computer Science Engineering from Ghent University, where he still works as a post-doctoral researcher and has been involved in several national and European projects. His work has been published in numerous international journals and conference proceedings.
