The UDS index is at the center of the deduplication capability of the dm-vdo device mapper target. The UDS index holds the metadata that allows dm-vdo to find duplicate blocks quickly with minimal storage overhead. The index is highly optimized to take advantage of common properties of real-world data sets, and it might also prove useful in data processing environments other than block storage.
This presentation will explore the inner workings of the UDS index and the trade-offs that lead to its efficiency but also determine what it can and cannot do.
Ceph is a distributed, unified, software-defined storage solution often used as an S3 object storage backend via Ceph RadosGW. This session shows how NGINX can be configured to cache Ceph RGW objects at the NGINX layer and offload intense workloads in a secure way, as NGINX is a highly performant web server.
A Data Solutions Architect experienced in developing and maintaining a cloud-native approach, based on agile methodology, for modern IT supply chains such as distributed cloud, data and AI/ML, automation, cloud-native development, and telco.
Your data is the most critical resource in your OpenShift/K8s cluster: without it you cannot continue to function or serve your customers. OpenShift Container Storage (OCS) protects your data by addressing three difficult problems: high availability, backup, and disaster recovery. OCS is an OpenShift operator based on Rook and Ceph that provides cloud-native software-defined storage for your applications, and is fully integrated into OpenShift. In this session we will discuss:
- Availability zone failure protection in two- or three-availability-zone setups.
- Backup and restore for your cluster data, mitigating data corruption or reverting unwanted changes.
- Disaster recovery: protecting you from a complete data center failure.
We will go over recommendations and best practices learned from running OpenShift clusters in the public cloud and on-premise.
Orit is an experienced software engineer who is passionate about open source and infrastructure with extensive experience with distributed systems and storage. She is OpenShift Container Storage Architect at Red Hat focusing on storage for Containers, hybrid cloud, multi cloud and...
Thursday February 18, 2021 4:30pm - 4:55pm CET
Session Room 4
During the last five years, the speakers gained experience in an enterprise environment on their DevOps missions. In two separate teams, they came across two similar problems.
At first glance, the new freedom of tools, languages and frameworks felt liberating. But it quickly led to efficiency traps and unevenly distributed knowledge inside the team.
Secondly, most of the corporate templates for architecture documentation just did not suit their context and toolset. So instead of using long Word documents, they started to experiment with a docs-as-code approach.
You will learn from first-hand experience how to prevent these problems, whether you are just starting out with DevOps or already getting your hands dirty. We will share some tool tips with you to increase your speed and comfort on your DevOps track.
OpenShift adoption keeps growing every day and with that new challenges arise.
One of those challenges is managing your multiple OpenShift clusters across your different infrastructures, whether they are on-prem or cloud-based.
In this session, we will give an overview of Red Hat Advanced Cluster Management for Kubernetes and discover how it can help us manage our Kubernetes clusters across the globe.
We will go through the four main pillars of RH ACM:
- Cluster Lifecycle: how to deploy, update, and manage OpenShift clusters with RH ACM.
- Observability: how to consume metrics and data from multiple clusters at a single point.
- Application Lifecycle: how to deploy and manage applications across our cluster fleet.
- Policy and Governance: how to define policies and configurations so our clusters are compliant with the security best practices at our company.
The audience can expect to get:
- Some of the challenges you will face when dealing with multi-cluster management.
- A basic understanding of multi-cluster observability and how it can help diagnose issues and understand the current state of your infrastructure.
- Basic notions of GitOps and how we can use GitOps to deploy applications to our cluster fleet.
- Basic notions around policies and how you can use them to be compliant with the security standards at your company.
Apache Kafka is the de facto data streaming platform used for ingesting vast amounts of data and processing them in real-time. Low latency analytics are vital if users are to react to events as fast as possible and to effectively shape future decision making. The Apache Kafka upstream community provides the Kafka Streams library for simplifying the development of highly scalable applications for filtering, mapping, transforming, and enriching these data. During this session, we will explore how we can use Kafka Streams to help a Formula 1 team gain insights during a race. The 'real' data will come from a well-known racing game and will be processed by our application in real-time, making us feel like real track-side F1 engineers!
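Kafka Streams itself is a Java library, but the filter/map/enrich shape of such a topology can be sketched language-neutrally with plain Python generators. The telemetry field names, drivers, and threshold below are hypothetical stand-ins, not data from the talk:

```python
# Illustrative sketch of a filter -> map -> enrich pipeline, mirroring the
# shape of a Kafka Streams topology. All field names are hypothetical.

def source(records):
    """Stand-in for consuming records from a Kafka topic."""
    yield from records

def only_fast_laps(stream, threshold_s):
    """filter(): drop laps slower than the threshold."""
    return (r for r in stream if r["lap_time_s"] < threshold_s)

def to_summary(stream):
    """mapValues(): keep only the fields downstream consumers need."""
    return ({"driver": r["driver"], "lap_time_s": r["lap_time_s"]} for r in stream)

def enrich_with_personal_best(stream):
    """Stateful step: attach a running personal-best flag per driver."""
    best = {}
    for r in stream:
        prev = best.get(r["driver"])
        r["personal_best"] = prev is None or r["lap_time_s"] < prev
        if r["personal_best"]:
            best[r["driver"]] = r["lap_time_s"]
        yield r

laps = [
    {"driver": "HAM", "lap_time_s": 92.1},
    {"driver": "VER", "lap_time_s": 95.7},
    {"driver": "HAM", "lap_time_s": 91.4},
]
out = list(enrich_with_personal_best(to_summary(only_fast_laps(source(laps), 95.0))))
```

In the real Kafka Streams DSL these stages correspond roughly to `filter`, `mapValues`, and a stateful transformation backed by a state store.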
Paolo is a Principal Software Engineer working for Red Hat on the messaging and IoT team. He has been working on different integration projects involving AMQP, Apache Kafka and Spark, and on the EnMasse messaging-as-a-service project and its integration with MQTT. Currently, he...
Tom has various Computer Science degrees and a PhD in Distributed Streaming Systems. His research focused on performance modelling of streaming systems like Apache Storm and involved digging deep into the nuts and bolts of big data stream processing. He even got to work at Twitter...
Friday February 19, 2021 9:45am - 10:10am CET
Session Room 4
OpenShift Connector for VS Code brings the power and convenience of Kubernetes and Red Hat OpenShift to developers. With the extension, users can create a new Kubernetes or OpenShift application and deploy it to a local or remote cluster in seconds, simplifying the developer experience. The VS Code extension allows you to create, connect, deploy, and debug components without leaving your IDE and breaking your development flow. Easily deploy code directly to Kubernetes or OpenShift using the devfile deployment method.
Through the demo presented, you'll see how to deploy a Covid-19 Tracker React application on OpenShift 4.x. The end-to-end example covers the following scenarios:
- Create a local instance of OpenShift using Red Hat CodeReady Containers within VS Code
- Devfile integration: deploy the Covid-19 tracker application using the NodeJS devfile component
- Deploy Operator-backed services: Operators provide custom resource definitions (CRDs), which you can use to create service instances and link them to components
- Integrated debugging and log viewing/streaming of the component
- Explore Kubernetes resources such as BuildConfigs and pods from the IDE
This lets you write, build, and deploy applications entirely on Kubernetes or OpenShift, bringing iterative development and deployment flows directly to developers.
Mohit Suman is based out of the beautiful country of India. He works as a Senior Technical Product Manager at Red Hat on Developer Experience. He has experience in product management, software engineering and architecture in fields ranging from large-scale distributed computing and developer...
Friday February 19, 2021 10:30am - 10:55am CET
Session Room 4
PowerShell is a task automation and configuration management framework from Microsoft, consisting of a command-line shell and the associated scripting language. Initially a Windows-only component, known as Windows PowerShell, it was made open-source and cross-platform on 18 August 2016 with the introduction of PowerShell Core. PowerShell Core concepts differ in many ways from those of classical Linux shells. This session is an introduction to PowerShell Core. You will learn:
- How to install it on your Linux distribution
- The basic concept of the object pipe
- Basics of the PowerShell scripting language
- How to get help
- Security: script signing
Stepan started as a freelance developer and trainer in 1995. In 2006 Stepan joined Microsoft as a Technical Evangelist in the Czech Republic. After nine years he left Microsoft to start working as European Cloud Team Lead at the pharmaceutical company MSD. He spent in the pharma industry one and a half...
Friday February 19, 2021 11:30am - 12:10pm CET
Session Room 4
Dekorate offers a collection of Java annotations and processors that generate Kubernetes manifests during the compilation of your application. It makes your life even easier because you do not even need to edit the JSON and YAML files. You can use Java annotations, an application.properties file, or a combination of both to customize the manifests. In this session I will show you how easy it is to create the Kubernetes manifests to deploy your microservices to the container platform with Dekorate, don't miss it!
I have been a software developer since 2005 and I develop Java web applications. I live in Madrid and I'm working for Red Hat with a focus on the integration of Spring Boot & Spring Cloud technology within the Red Hat Middleware portfolio. I contribute to existing or new Spring(Boot...
Observability and monitoring are important parts of every platform and of any application running on top of it, but they often don't get the attention they deserve. Only when the first problems arise does monitoring come into focus. The first production issues then often reveal the weaknesses in the monitoring solution: missing, lost or delayed data, reliability issues, etc. When things go wrong, the monitoring tool chain comes under more pressure and its issues become visible. Apache Kafka is often advertised as a streaming platform, but delivering logs and metrics is one of its most common use cases. This talk will explain the main advantages of using Kafka as a data pipeline in monitoring applications such as Fluentd or Jaeger tracing. A demo will show how to use the Strimzi CNCF project to set up this stack.
Jakub works at Red Hat as Senior Principal Software Engineer. He has long-term experience with messaging and currently focuses mainly on Apache Kafka and its integration with Kubernetes. He is one of the maintainers of the Strimzi project which provides tooling for running Apache...
Over the past decade, we have witnessed a dramatic shift in how developers package and deploy their source code. With Kubernetes, the unit of delivery has shifted from compiled binaries and scripts to container images. Assembling container images is not a simple task, requiring deep knowledge of a growing set of tools and technologies. This task becomes even more difficult when trying to replicate these processes on Kubernetes.
In this session we will introduce Shipwright, a vendor-neutral project that provides a framework for building container images on Kubernetes. We will discuss the origins of Shipwright, the objectives of the project, and demonstrate how Shipwright can be used to build a container image on Kubernetes using a wide variety of tools. Shipwright currently powers container image builds on IBM Code Engine, and will be used as the basis for OpenShift Builds v2 in the near future.
Adam Kaplan (he/him/his) is a software engineer at Red Hat, a maintainer of the Shipwright and Tekton projects, and a CD Foundation Governing Board member. He currently leads efforts at Red Hat to simplify hybrid cloud application development, and previously maintained developer-focused...
In this session, we will go over the Edge Computing architecture and explain how Edge Computing can be implemented with Ansible. Ansible is a mature open source project that supports automation across multiple platforms. Ansible also provides many layers of deployment security. The Edge Computing data centers can be built on top of OpenShift, leveraging different OpenShift features such as cloud-native toolkits, microservice frameworks and container technologies. The Edge Computing operational manager can communicate with different Edge Computing running platforms. The Edge Computing running platform would have an external gateway hooked up to the 3scale API management. All these different deployments can be orchestrated using Ansible. We will go over a few examples and get you started using Ansible for Edge Computing.
In this presentation you'll get to know a little bit more about Eclipse JKube and how to get started to integrate it in your Java Project. JKube brings your Java applications closer to Kubernetes and OpenShift by providing the tools you need to easily deploy them into the cloud.
The presentation starts with a very brief introduction to the project and a description of its main features.
After this very brief introduction we'll get hands-on with the project and start a live demonstration. A random popular GitHub hosted Java project will be cloned and quickly configured to use Eclipse JKube. Following the initial project setup, the application will be deployed both to an OpenShift and a Kubernetes cluster. Additional JKube features such as in-cluster debugging or log inspection will also be showcased using this project.
Recently developed features and next steps will also be shared at the end of the presentation.
Marc is an open-source enthusiast and software developer. Currently Marc is working as a senior software engineer at Red Hat in the Developer Tools team focusing on Java. He is leading the development efforts of Eclipse JKube and is part of the core maintainer team for Fabric8 Kubernetes...
Friday February 19, 2021 4:30pm - 4:55pm CET
Session Room 4
In this session, we will go over examples of high-quality OpenShift applications, including best development practices, guidelines and principles. We will review some commonly made mistakes and talk about ways to solve them. We will review metrics that we could use to track quality in our applications, and look at different ways to measure performance and quality.
Mario is a software developer. He used to send his code to Luigi so he could deploy it on their servers, but he now wants to embrace DevOps principles and work together with Luigi to deploy his applications. This is where Tekton Pipelines comes to help them. Tekton is a flexible, Kubernetes-native open source CI/CD framework that enables automating deployments across multiple platforms (including Kubernetes, serverless, and VMs) by abstracting away the underlying details. During this talk, the attendees will learn about some of the Tekton resources, how to install the pipelines, and how to run them for building, testing and deploying containerized applications on their clusters.
Joel Lord is passionate about the web and technology in general. He likes to learn new things, but most of all, he wants to share his discoveries. He does so by travelling to various conferences all across the globe. He graduated from college in computer programming in the last millennium...
Friday February 19, 2021 5:30pm - 6:10pm CET
Session Room 4
Transactions are one of the most complex and yet most important areas of computing. They get particularly hard when the system moves to distributed environments, as almost every component in a distributed system is liable to fail. Traditional locking protocols, used in transaction solutions today, are then very prone to holding locks on resources for unnecessarily long periods. The saga pattern provides an alternative non-blocking solution with a design that allows individual parts of the transaction to be committed immediately and independently. This design is particularly suitable for long-running transactions and distributed systems. In this session, we will present a newly created MicroProfile specification called Long Running Actions (LRA) which provides a definition of the transactional protocol and a simple API for distributed transactions in the Java microservices environment, based on the saga pattern. We will show you why the saga pattern is a very suitable transactional solution for many distributed microservices applications and demonstrate the usage of the LRA specification with a live-coded demo.
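The core of the saga pattern described above can be sketched in a few lines of Python. LRA itself exposes this idea through Java annotations and an HTTP coordinator; the booking/payment step names below are purely illustrative:

```python
# Minimal saga sketch: each step commits immediately; if a later step fails,
# the compensations of the already-completed steps run in reverse order,
# so no locks are held for the duration of the whole transaction.
class Saga:
    def __init__(self):
        self.steps = []                      # (action, compensation) pairs

    def step(self, action, compensation):
        self.steps.append((action, compensation))
        return self

    def run(self):
        done = []                            # compensations of completed steps
        try:
            for action, compensation in self.steps:
                action()                     # commits immediately
                done.append(compensation)
        except Exception:
            for compensation in reversed(done):
                compensation()               # semantically undo committed work
            return "compensated"
        return "completed"

log = []

def charge_payment():                        # hypothetical failing step
    raise RuntimeError("payment declined")

result = (Saga()
          .step(lambda: log.append("seat reserved"),
                lambda: log.append("seat released"))
          .step(charge_payment,
                lambda: log.append("payment refunded"))
          .run())
```

Note that the compensation is a semantic undo, not a rollback: the first step's work was already visible to other services before it was compensated, which is exactly the trade-off the saga pattern makes to avoid long-held locks.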
A software engineer working mainly on Red Hat middleware runtime technologies such as the WildFly / JBoss EAP application servers, Thorntail and Quarkus, and on individual components included in these projects such as RESTEasy, Weld or Hibernate. He is also actively participating in MicroProfile...
Saturday February 20, 2021 9:45am - 10:25am CET
Session Room 4
Red Hat contributes to stabilizing the upstream Linux kernel using its enterprise-class hardware. For many years, this hardware has been managed through Beaker, software for managing and automating labs of test computers. Beaker is used often, especially when testing different architectures and special hardware.
However, in some cases, for example when generic x86_64 devices suffice, it is easier to take advantage of the stable infrastructure of different providers. As providers appear, evolve their services and add non-x86_64 hardware, it becomes important to be able to target these providers for running tasks and scaling out testing.
Once a system is up, a testing harness called restraint provides a relatively low-level and lightweight way to ensure tasks are executed. Beaker has used this harness for some time. It fulfills important requirements for kernel testing, such as being reliable, being able to handle machine reboots, and more.
Because of the CKI (Continuous Kernel Integration) team's need to scale out to different providers, I've written the UPT (Unified Provisioning Tool) project to simplify provisioning, and the Restraint Test Runner to take advantage of restraint for (kernel and non-kernel) test running.
This talk will briefly discuss the current approach to running (primarily kernel) tests and targeting different cloud providers.
This talk will discuss the aforementioned tools, their capabilities and features to execute tasks, run tests, simplify test result interpretation, provide partial test results and handle unexpected behavior.
As different tools in the open source community evolve as well, we will also discuss possible cooperation, as this is already on the radar.
We encourage you to share and invite people who might be interested; this talk is suitable for anyone in kernel testing/tools, CI and related topics.
Robot Framework is a generic open-source, Python-based, widely used automation framework. It is an extensible keyword-driven automation framework, useful for acceptance testing, acceptance test-driven development (ATDD), behavior-driven development (BDD), and robotic process automation (RPA). It can be used in distributed, heterogeneous environments, where automation requires using different technologies and interfaces. Robot Framework and its libraries go a long way toward meeting versatile product requirements.
When specific functionality is needed that the existing Robot Framework libraries do not handle, customizing Robot Framework is the best solution. This talk will present the idea of how to use Robot Framework to make automation possible for any product. It will cover a generic design for extending Robot Framework that is applicable to any custom product. Extending the existing framework in this way supports functionality missing for a wide range of products and makes test automation feasible in any project. A demo will add a practical implementation of the presented design and will also cover how to create a project-specific framework using Robot Framework.
The talk will initially cover a brief refresher introduction to Robot Framework; detailed basics of Robot Framework are out of scope. The major focus will be on understanding how to design and implement Robot Framework customization, to make test automation feasible for any custom product.
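As a concrete illustration of the extension mechanism described above: a custom Robot Framework library is just a Python class whose public methods become keywords. The product name, default URL, and session semantics below are hypothetical stand-ins, not part of any real product:

```python
# Minimal sketch of a custom Robot Framework keyword library.
# Public methods become keywords: `open_session` is callable from a
# .robot file as the keyword "Open Session", and so on.

class ProductLibrary:
    """Example keyword library for a hypothetical product."""
    ROBOT_LIBRARY_SCOPE = "SUITE"            # one instance per test suite

    def __init__(self, base_url="http://localhost:8080"):
        self.base_url = base_url             # hypothetical product endpoint
        self._sessions = {}

    def open_session(self, name):
        """Keyword: Open Session    <name>"""
        self._sessions[name] = {"url": self.base_url, "open": True}
        return name

    def session_should_be_open(self, name):
        """Keyword: Session Should Be Open    <name>"""
        if not self._sessions.get(name, {}).get("open"):
            raise AssertionError(f"Session {name!r} is not open")

    def close_session(self, name):
        """Keyword: Close Session    <name>"""
        self._sessions[name]["open"] = False
```

In a `.robot` suite this would be loaded with `Library    ProductLibrary    http://myhost:8080` (path and URL hypothetical), after which the methods are invoked as the keywords `Open Session`, `Session Should Be Open`, and `Close Session`.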
I am based out of Red Hat Pune as a software quality engineer. In Red Hat, I have worked on projects such as Red Hat Gluster Storage and am currently working on OpenShift Container Storage. I have 2 years of experience in VM kernel development and 8 years of experience in automation...
Saturday February 20, 2021 11:30am - 12:10pm CET
Session Room 4
The Testing Farm Team has been fighting an unreliable testing infrastructure for years. Last year we got the possibility of a failover from our internal OpenStack cloud to the public cloud (AWS). In this session, we will guide you through our journey of adding a transparent failover functionality to our CI pipeline, with AWS complementing an OpenStack tenant. We will look at the requirements on the AWS cloud in terms of connecting it to the internal infrastructure and at the main features of our open-source provisioner Artemis, which made this failover possible: a programmable routing mechanism. In addition to the actual provisioning, Artemis also shields us from short-lasting infrastructure outages, which previously caused irrecoverable failures in our CI pipeline. In the end, we would like to show you how you can take advantage of Artemis for your own benefit and share the interim plans of adding support for additional public clouds, Beaker, and more features planned by the team.
openQA is an integration testing framework for whole operating systems. It can perform the same actions that a human would perform when interacting with the system under test, only in an automated fashion. This can be leveraged to continuously test repetitive tasks, like the installer of a Linux distribution. It is used extensively by both the Fedora and openSUSE distributions to continuously test their latest images. This talk will introduce openQA, its basic concepts, where it is applicable and showcase some more advanced features (like testing on bare metal hardware). It is aimed at beginners and will show you whether it is applicable for your use case while also providing some instructions how to start using it.
Dan joined SUSE to work on development tools as part of the developer engagement program, after working on embedded devices. Currently he maintains the openSUSE Vagrant boxes and Vagrant, and creates the Open Build Service Connector, an extension for Visual Studio Code that integrates...
Saturday February 20, 2021 12:45pm - 1:25pm CET
Session Room 4
Many modern web applications use API schemas to describe their contracts. But the presence of a schema doesn't mean that the real application works as defined in the schema. There are many reasons for that, from the fundamental inability to describe the application with the chosen schema spec to the ubiquitous human factor. This problem has many consequences, and an application crash is the least dangerous of them. I will talk about Schemathesis, a tool that helps to solve many of these problems with property-based testing. We'll go through typical use cases and talk about stateful testing, an approach that allows you to generate whole sequences of API calls automatically. You'll learn how to test API schemas with minimal effort and create effective test scenarios that will make your applications more reliable. If you are interested in the practical usage of property-based testing and how to implement it in real-life projects, I am keen to see you at the session!
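Stripped of the tooling, the core idea of property-based testing can be sketched in pure Python: generate many random inputs and check an invariant on each, instead of hand-picking examples. Tools like Schemathesis automate this (deriving the inputs and properties from an API schema, plus shrinking failing cases); the `encode`/`decode` pair below is a hypothetical stand-in for an API round trip:

```python
# Hand-rolled sketch of property-based testing. Real tools (Hypothesis,
# Schemathesis) generate smarter inputs and shrink failures automatically.
import json
import random
import string

def encode(payload):
    """Hypothetical serializer under test."""
    return json.dumps(payload, sort_keys=True)

def decode(raw):
    """Hypothetical deserializer under test."""
    return json.loads(raw)

def random_payload(rng):
    """Generate a random payload instead of a hand-picked example."""
    return {
        "name": "".join(rng.choice(string.ascii_letters)
                        for _ in range(rng.randint(0, 8))),
        "count": rng.randint(-1000, 1000),
        "tags": [rng.choice(["a", "b", "c"]) for _ in range(rng.randint(0, 3))],
    }

def check_roundtrip_property(trials=200, seed=42):
    """Property: decode(encode(x)) == x for every generated payload."""
    rng = random.Random(seed)
    for _ in range(trials):
        payload = random_payload(rng)
        assert decode(encode(payload)) == payload
    return trials

check_roundtrip_property()
```

The same shape applies to an API: generate requests that conform to the schema, send them, and assert invariants such as "the response matches the documented response schema" for every generated case.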
Although building ML models on small or toy datasets is easy, most production-grade problems involve massive datasets that current ML practices don't scale to. In this talk, we cover how you can drastically increase the amount of data that your models can learn from using distributed data/ML pipelines.
It can be difficult to figure out how to work with large datasets (which do not fit in your RAM), even if you're already comfortable with ML libraries and APIs within Python. Many questions immediately come up: Which library should I use, and why? What's the difference between a 'map-reduce' and a 'task-graph'? What's a partial fit function, and what format does it expect the data in? Is it okay for my training data to have more features than observations? What's the appropriate machine learning model to use? And so on...
In this talk, we'll answer all those questions, and more!
We'll start by walking through the current distributed analytics (out-of-core learning) landscape in order to understand the pain-points and some solutions to this problem.
Here is a sketch of a system designed to achieve this goal (of building scalable ML models):
1. a way to stream instances
2. a way to extract features from instances
3. an incremental algorithm
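These three pieces can be sketched in pure Python. Real pipelines would use Dask, tf.data, or scikit-learn's `partial_fit` instead; the synthetic data and the toy online logistic regression below are illustrative only:

```python
# Pure-Python sketch of the three pieces of an out-of-core learning system.
import math
import random

def stream_instances(n, seed=0):
    """1. Stream instances one at a time instead of loading them into RAM."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if x[0] + x[1] > 0 else 0      # synthetic, learnable label
        yield x, y

def extract_features(x):
    """2. Per-instance feature extraction (here: just add a bias term)."""
    return [1.0] + x

class OnlineLogisticRegression:
    """3. Incremental algorithm: one SGD update per instance, O(1) memory."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def partial_fit(self, feats, y):
        z = sum(w * f for w, f in zip(self.w, feats))
        p = 1.0 / (1.0 + math.exp(-z))       # sigmoid
        for i, f in enumerate(feats):
            self.w[i] -= self.lr * (p - y) * f

    def predict(self, feats):
        return 1 if sum(w * f for w, f in zip(self.w, feats)) > 0 else 0

model = OnlineLogisticRegression(n_features=3)
for x, y in stream_instances(5000):
    model.partial_fit(extract_features(x), y)

correct = sum(model.predict(extract_features(x)) == y
              for x, y in stream_instances(500, seed=1))
```

The key property is that memory usage is independent of dataset size: the stream could just as well be reading batches from disk or an object store, which is exactly what Dask and tf.data provide at scale.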
Then we'll read a large dataset into Dask, TensorFlow (tf.data) & scikit-learn streaming, and immediately apply what we've learned in the last section. We'll move on to the model building process, including a discussion of which model is most appropriate for the task. We'll evaluate our model a few different ways, and then examine the model for greater insight into how the data is influencing its predictions. Finally, we'll practice this entire workflow on a new dataset, and end with a discussion of which parts of the process are worth tuning for improved performance.
Detailed Outline
1. Intro to out-of-core learning
2. Representing large datasets as instances
3. Transforming data (in batches) - live code [3-5]
4. Feature Engineering & Scaling
5. Building and evaluating a model (on entire datasets)
6. Practicing this workflow on another dataset
7. Benchmarking other libraries for OOC learning
8. Questions and Answers
Key takeaway
By the end of the talk, participants will know how to build petabyte-scale ML models, beyond the shackles of conventional Python libraries.
Participants will also take away benchmarks and best practices for building such ML models at scale.
I am a Data Scientist and a Master's candidate in Computational Linguistics at Universität Stuttgart. I am currently researching speech, language and vision methods for extracting value out of unstructured data. In my previous stint with Deloitte Consulting LLP, I worked with Fortune...
Saturday February 20, 2021 2:30pm - 2:55pm CET
Session Room 4
Live audio transcription and other similar applications require stateful processing to support both multi-user sessions and dynamic scale-out. We can persist audio state with a Kafka kappa architecture, but that state must also be preserved across the OpenShift cluster boundary to user web clients. Fortunately, OpenShift's sticky sessions allow stateful sessions to be implemented without complicated custom configurations.
In this talk, Gage will explain how to convert your single user constrained application to support stateful sessions with any number of users. Using the power of OpenShift and Open Data Hub's data monitoring and streaming tools, a stateful architecture can be developed and managed easily. We will showcase a real-time audio transcription use case, including a Kafka streaming architecture, in a practical data science application.
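The session affinity mechanism this architecture relies on can be modeled in a few lines of Python. This is only a toy model of cookie-based stickiness (the OpenShift router implements it via a per-route cookie in HAProxy); the replica names and cookie key are made up:

```python
# Toy model of cookie-based session affinity ("sticky sessions"):
# the first request from a session picks a replica and records it in a
# cookie; subsequent requests with that cookie go to the same replica,
# so per-session state (e.g. an audio transcription stream) stays local.
import random

class StickyRouter:
    def __init__(self, replicas, seed=None):
        self.replicas = list(replicas)
        self.rng = random.Random(seed)

    def route(self, cookies):
        replica = cookies.get("sticky")
        if replica not in self.replicas:     # new session or stale cookie
            replica = self.rng.choice(self.replicas)
            cookies["sticky"] = replica      # the router sets the cookie
        return replica

router = StickyRouter(["pod-a", "pod-b", "pod-c"], seed=1)
session_cookies = {}
chosen = router.route(session_cookies)       # first request picks a replica
```

Because the chosen backend is recorded in the cookie rather than derived from a hash, the mapping survives replicas being added during scale-out; only sessions pinned to a removed replica need to be re-established.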
Microcopy is the text that makes up a large portion of the interfaces we surround ourselves with. Its power is often overlooked, but getting it right can make or break your UI. Using consistent and clear labels, and making sure you're using the words that fit the user best, is just as important as the components you use or how you lay things out on the screen. In this talk, UX designer Andreas and content writers Shweta and Vendula will show how you can collaborate across professions to create great interfaces. While the UXD team works to understand the functionalities and create the wire-frames, technical writers can join hands to provide intuitive microcopy. They will also show both good and bad examples of writing in interfaces and will give you some simple tips on how you can become a better writer and improve your interfaces.
A technical communication professional passionate about minimalism, structure, and technical writing in general. My other areas of interest are agile and helping people with their technical writing careers.
I am a senior technical writer with the RHEL platform team. I joined Red Hat a year ago. I have over 12 years of experience in product documentation for enterprise software products from varied domains. I am specialized in DITA-based authoring, information architecture, and UX writing...
Saturday February 20, 2021 3:30pm - 3:55pm CET
Session Room 4