Two days training: 60% theory + 40% practice
Big Data is the hype of the moment in ICT and marketing. Since its inception in 2007, Apache Hadoop has been regarded as the de facto standard for storing and processing large data volumes in batch.
But every technology has its limitations, and Hadoop is no different: it is batch-oriented, and the MapReduce framework is too limited to handle all types of data analysis within the same technology stack.
Apache Spark makes big data easy to implement. It was developed in 2009 at the AMPLab (Algorithms, Machines, and People Lab) of the University of California, Berkeley, and donated to the open source community in 2010. It is faster than Hadoop, in some cases up to 100 times faster, and it offers a framework that supports different types of data analysis within a single technology stack: fast interactive queries, streaming analysis, graph analysis and machine learning. During this two-day hands-on workshop, we discuss the theory and practice of several data analysis applications.
Notebook technologies like Zeppelin, Jupyter, Spark Notebook and Databricks Cloud allow you to go from prototype to production workflow in one go. Notebooks let you implement "reproducible research" by mixing executable code with comments, images, tables, links, and more.
We've chosen Databricks Cloud as our notebook technology because it is the most mature, enterprise-ready notebook technology on the market at this moment. It is available on AWS and Azure.
This course supports Spark 2.x using Python & Scala.
AI FOR BUSINESS
Half-day seminar for business people
This seminar brings you up to speed on the state of the art in artificial intelligence, and offers you a guided tour through the fascinating world of automation, (chat)bots, data mining, neural networks, (un)supervised learning, machine learning, deep learning and data science.
AI and its subdomain Data Science are in the news almost every day: from how “sexy” the job of a data scientist is to the “infinite” possibilities of Artificial Intelligence.
What do internet giants like GAFA do with AI?
Beneath all the hype that surrounds artificial intelligence (AI), automation and data science, real breakthroughs in AI are happening at this moment, and they are transforming the way we do business. AI developers are creating software that doesn't just do what it is programmed to do, but is able to anticipate the needs of customers and users through a combination of pattern recognition, knowledge mining, planning and reasoning.
How do you tackle your AI projects? What kind of data architecture is optimal?
This seminar will explain how Artificial Intelligence evolved through its "winters" into the narrow AI we all use in our daily lives. Some studies predict that AI will change our jobs and our economy profoundly. Even today, companies can plug into AI from the cloud to start augmenting their employees.
Data Science is a subdomain of Artificial Intelligence that is already widely adopted by large organisations. It helps those organisations define the next best offer for their customers, predict which customers are most likely to churn, segment customers into meaningful groups, and so on. We will give an overview of what it means to be a data scientist, how data science can be used for business, and how to start adopting data science within your organisation.
KSTREAM AND KSQL
Two days training: One day of theory, additional day with exercises on demand
Apache Kafka is an open-source stream-processing software platform. It is used for many use cases where data needs to flow in real time and be readily available.
Kafka Streams (Kstreams) and KSQL are technologies for building streaming applications that transform input Kafka topics into output Kafka topics. Kafka Streams lets you do this with concise code in a way that is distributed and fault-tolerant, while KSQL doesn't require any code at all.
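As an illustration of the no-code approach, a KSQL sketch along these lines continuously filters one Kafka topic into another; the stream, column and topic names here are purely illustrative.

```sql
-- Declare a stream over an existing Kafka topic (illustrative schema).
CREATE STREAM pageviews (user VARCHAR, status INT)
  WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON');

-- Continuously write all error responses to a new output topic,
-- without writing any Java or Scala code.
CREATE STREAM errors AS
  SELECT user, status
  FROM pageviews
  WHERE status >= 400;
```

The equivalent Kafka Streams application would express the same filter in code, gaining fine-grained control at the cost of more development effort.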
Learn more soon about the content of this course
Two days training: one day theory + one day practice
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. Learn how to master this in-demand skill that will change your software development processes.
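To make "automating deployment and scaling" concrete, here is a minimal, illustrative Kubernetes Deployment manifest; the application name and image are placeholders, not part of the course material.

```yaml
# Illustrative Deployment: run three replicas of a containerized web app.
# Kubernetes keeps the desired replica count, restarting or rescheduling
# containers automatically when they fail.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app
          image: nginx:1.25
          ports:
            - containerPort: 80
```

Applying this manifest with `kubectl apply -f` is all it takes to deploy; scaling is a one-line change to `replicas`.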
Topics (agenda soon online)
Learn more soon about the content of this course
Two days training: one day theory + one day practice; can be shortened to half a day on demand.
We still offer this course for legacy reasons.
The rise of the internet, social media and mobile technologies, and in the very near future the Internet of Things, ensures that our data footprint is growing fast.
Companies like Google and Facebook were quickly confronted with massive data sets, which led to a new way of thinking about data. Hadoop provides an open source solution based on the same technology used within Google. It allows you to store and analyze huge amounts of data in a scalable way to create new insights.
With this workshop we want to give everyone the opportunity to get acquainted with the Hadoop ecosystem.