Shopping independently presents significant challenges for the visually impaired. I address these obstacles with a Python-based shopping assistance system that combines product and hand detection with speech recognition.
In this talk, we’ll explore how computer vision technology, leveraging a camera and pre-trained models, enables visually impaired users to navigate supermarkets and select products autonomously. By recognizing items and providing acoustic guidance for hand positioning, the system empowers users to shop without external help. Additionally, attendees will learn how voice commands, processed through speech recognition, create a user-friendly interface.
Network automation is important for efficient and reliable network management. In simple terms it means using software to automate tasks like configuring or testing network devices like routers or switches. This increases the speed of deploying new configurations while reducing human error - a win-win situation for everyone!
This talk introduces Netmiko, a powerful yet simple Python library for network automation. Netmiko provides an easy-to-use interface for SSH-based interactions with network devices. The syntax of Netmiko is easy to understand, and its vendor-agnostic approach lets you automate all kinds of network devices.
I'll show you how Netmiko works, starting with building your inventory of devices and connecting to them. Then we'll move on to sending config commands and configuring the devices. Finally, we'll verify if the config was applied successfully.
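The workflow above can be sketched roughly as follows. This is a hedged, minimal example: the device type, address, credentials, and hostname command are placeholders, not details from the talk, so adapt them to your own inventory.

```python
# Minimal sketch of the Netmiko workflow: inventory, connect, configure, verify.
# Device details below are placeholders.
def configure_hostname(device, new_hostname):
    from netmiko import ConnectHandler  # requires the netmiko package

    with ConnectHandler(**device) as conn:
        conn.send_config_set([f"hostname {new_hostname}"])       # configure
        return conn.send_command("show run | include hostname")  # verify

inventory = [
    {
        "device_type": "cisco_ios",  # Netmiko supports many vendors
        "host": "192.0.2.1",         # placeholder address
        "username": "admin",
        "password": "secret",
    },
]
```

Calling `configure_hostname(inventory[0], "demo-router")` against a reachable device would push the config and return the verification output.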
By the end of this talk, you will be able to use Netmiko to automate your own network devices!
"Code Makeover: Mastering the Art of Python Refactoring" is your guide to transforming cluttered Python scripts into models of efficiency and readability.
Discover the art of refining your code without losing your mind in the process.
From recognizing the need for a makeover to applying the finishing touches, this talk will equip you with the strategies, best practices, and cautionary tales to ensure your code not only works beautifully but is also a joy to read and maintain.
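A miniature example of the kind of makeover in question (toy code, not taken from the talk): same behavior, far more readable after the refactor.

```python
# before: works, but clunky - manual index loop and accumulator
def total_before(items):
    t = 0
    for i in range(len(items)):
        if items[i] is not None:
            t = t + items[i]
    return t

# after: identical behavior, idiomatic and easy to read
def total_after(items):
    return sum(x for x in items if x is not None)

print(total_before([1, None, 2]), total_after([1, None, 2]))  # 3 3
```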
asyncio is the de-facto standard for asynchronous programming in Python and enables concurrent operations without using threads or processes.
In this talk, we will delve into the technical details of asyncio and show how it can be used to improve the performance of Python applications. We will start by discussing the difference between threading, multiprocessing and async programming. Then, we will introduce the basic building blocks of asyncio: Event loops and Coroutines. We will dive deep into the way Coroutines work, discussing their origins and how they are linked to Generators.
Next, we will look at Tasks, which are a higher-level abstraction built on top of Coroutines. Tasks make it easy to schedule and manage the execution of Coroutines. We will cover how to create and manage Tasks and how they can be used to write concurrent code.
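As a small sketch of what Tasks look like in practice (names and delays are illustrative):

```python
import asyncio

async def fetch(name, delay):
    # stands in for real I/O, e.g. a network request
    await asyncio.sleep(delay)
    return name

async def main():
    # create_task schedules the coroutines concurrently on the event loop;
    # gather awaits both and preserves argument order
    t1 = asyncio.create_task(fetch("a", 0.2))
    t2 = asyncio.create_task(fetch("b", 0.1))
    return await asyncio.gather(t1, t2)

print(asyncio.run(main()))  # ['a', 'b'] - the two sleeps overlap
```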
Finally, we will also cover some more advanced topics such as Async Loops and Context Managers, and how to handle errors and cancellations in asyncio.
Whether you are new to asyncio or have experience with it, this talk will provide valuable insights and tips for leveraging its full potential. By the end of this talk, you will have a better understanding of how asyncio works, and how to use it to create efficient, high-performing Python applications.
It is common knowledge that floating point numbers (float) are tricky. When misused, floats may lead to construction disasters - I will mention some notable accidents. But mainly, I will dig into the topic from the Python interface and rounding methods (int vs round, divmod, the math library), via special values (NaN, Inf, -0.0, …), invoking different processor modes (FLT_ROUNDS), down to the bits of the IEEE 754 standard.
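A few of these sharp edges can be seen directly at the Python prompt (all standard library, no assumptions beyond CPython's IEEE 754 doubles):

```python
import math

print(round(2.675, 2))         # 2.67 - the stored binary value is just below 2.675
print(0.1 + 0.2 == 0.3)        # False - classic representation error
print(int(-1.5), round(-1.5))  # -1, -2: truncation vs banker's rounding
print(divmod(7.5, 2))          # (3.0, 1.5)
print(float("nan") == float("nan"))  # False - NaN never equals itself
print(math.copysign(1, -0.0))  # -1.0: negative zero keeps its sign bit
```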
Python has never been good at parallel computing. Multi-threading doesn't scale beyond a handful of threads because of the notorious GIL (Global Interpreter Lock). Multi-processing feels like a cumbersome workaround that increases complexity and overhead. And yet, we're firmly in an era of multicore machines, big data, and massive ML models that require all the compute they can get. Python, otherwise the star of data science and ML, doesn't really shine when it comes to parallel workloads.
But that is finally changing! There is more focus and progress happening in this area than ever before, and promising leaps forward are on the horizon. Subinterpreters, already merged in 3.12, offer a tentative step towards making the GIL less than "global"... while the free-threading build of Python 3.13 offers a path towards removing the GIL entirely in the future. Let's explore these new developments and look at how they work, what they do and do not solve, and how we can take advantage of them.
Pydantic, a powerful data validation library in Python, has emerged as a cornerstone for ensuring data integrity and reliability in modern software projects.
This talk will delve deep into the world of Pydantic and explore its myriad capabilities for data validation in Python applications. We will begin by understanding the core concepts of Pydantic, including data models, fields, and validators. With Pydantic's declarative approach, developers can effortlessly define the structure of their data and specify validation rules, ensuring that inputs meet the expected criteria.
One of Pydantic's key strengths lies in its ability to handle complex data structures with ease. We will explore how Pydantic supports nested models, allowing developers to represent hierarchical data relationships effectively. Additionally, Pydantic provides support for custom validators, enabling developers to implement domain-specific validation logic tailored to their application's requirements.
We will explore advanced features of Pydantic, including data parsing and serialization. Pydantic's ability to automatically convert input data to Python objects and vice versa simplifies the handling of data from various sources, such as JSON payloads or database records. We will also discuss strategies for handling errors and exceptions gracefully when working with Pydantic, ensuring that applications remain resilient in the face of unexpected inputs.
We will illustrate Pydantic's usage through practical examples and real-world scenarios. From validating user inputs in a web application to processing data from external sources, Pydantic empowers developers to write cleaner, safer, and more maintainable code. By embracing Pydantic as a fundamental tool for data validation, Python developers can enhance the reliability and integrity of their applications while reducing development time and effort.
The website of the Hypothesis project boldly asserts:
Normal “automated” software testing is surprisingly manual. Every scenario the computer runs, someone had to write by hand. Hypothesis can fix this.
While it's debatable whether property-based testing should fully replace the manual parametrization of tests with different inputs and outputs, there's no doubt that Hypothesis is a powerful tool for uncovering bugs nobody would even have considered looking for. In fact, during its development, the authors of Hypothesis accidentally discovered countless bugs in CPython and libraries, thus coining the term "The Curse of Hypothesis".
The framework, although incredibly powerful, might seem overwhelming at first. In this talk, I will demonstrate how even simply throwing random strings at functions can reveal surprising bugs. From there, we'll progress towards generating more complex data, which will be less daunting than it initially appears. You'll also see how Hypothesis seamlessly integrates with various ecosystems and can be a valuable tool in any developer's toolkit.
We are happy to announce Lightning Talks at this year's conference again! They are open to everyone 😊
How it works:
– You can register directly at the conference. First come, first served.
– Any proposal is welcome, as long as your talk has something to do with Python and respects our Code of Conduct. We reserve the right to reject talks.
– Talk time is strictly limited to 5 minutes.
– To keep turnaround times short, you will not be able to plug in your own device. Please submit your slides as PDF via email at least 30 minutes before Lightning Talks start. We will provide a laptop with all slides.
– By registering, you accept that your talk may be recorded, published and streamed live (audio & video) under Creative Commons Attribution 4.0 International license.
In the rapidly evolving field of Artificial Intelligence (AI), understanding model decisions is becoming increasingly vital. This talk explores why explanations are crucial for both technical and ethical reasons. We begin by examining the necessity of explainability in AI systems, particularly in mitigating unexpected model behavior and biases and in addressing ethical concerns. The discussion then transitions into Explainable AI (XAI), highlighting the differences between interpretability and explainability, and showcasing methods for enhancing model transparency. Real-world examples will demonstrate how these concepts can be practically employed to improve model performance. The talk concludes with reflections on the challenges and future directions in XAI.
While we have witnessed spectacular advancements in Machine Learning over the past months and years, robustness of results and establishment of causal relations remain lacking. During this talk we will walk you through an example of using causal Machine Learning techniques to estimate causal, heterogeneous treatment effects with Open Source Python tooling.
Learning causal relationships - in contrast to mere correlations - is of great importance for many applications where we'd like to learn how to intervene in the real world: To whom should we administer which medical drug? To whom should we offer a marketing voucher? For which automated processes should we have a human expert intervene? In such situations we'd like to rest assured our decisions don't just rely on correlations - potentially tainted by common confounders. Rather, we'd like to make causal statements about the heterogeneous effects of administering a treatment.
In terms of making this happen, the field of Causal Inference has been able to incorporate progress from Machine Learning in theory. Yet, in practice, applications remain challenging: tooling is still somewhat immature and few examples to follow exist. Therefore, we would like to walk you through a case study of estimating causal, heterogeneous treatment effects with Open Source Python tooling.
Schedule:
- Conceptual introduction to causal effect estimation [5']
- Conceptual introduction to MetaLearners [5']
- Comparison of Python tooling for Conditional Average Treatment Effect Estimation [5']
- Practical case study based on experiment data [10']
- Q&A [5']
Requirements:
- Fundamental knowledge about Statistical Learning or Machine Learning
- Fundamental knowledge about Python
After this session, a participant can expect to have an idea as to what causal, heterogeneous treatment effect estimation is about and how to approach it in Python.
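To give a flavor of the MetaLearner idea, here is a hedged sketch of a T-learner on synthetic data (the data-generating process and the use of scikit-learn are this sketch's assumptions, not the case study from the talk): fit one outcome model per treatment arm, then take the difference of their predictions as the estimated conditional average treatment effect.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 1))
t = rng.integers(0, 2, size=n)  # randomized binary treatment
# synthetic outcome with a heterogeneous effect of 1 + 2*x
y = x[:, 0] + t * (1 + 2 * x[:, 0]) + rng.normal(scale=0.1, size=n)

# T-learner: one outcome model per treatment arm
mu0 = LinearRegression().fit(x[t == 0], y[t == 0])
mu1 = LinearRegression().fit(x[t == 1], y[t == 1])

# estimated conditional average treatment effects, approx. 1 + 2*x
cate = mu1.predict(x) - mu0.predict(x)
print(round(cate.mean(), 2))  # close to the average effect of 1
```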
Recently, quantum machine learning algorithms have become popular due to a drastic increase in the power of quantum computation. Analysis of images with 10^54 pixels, easy encoding of Fourier series-like data, generation of novel chemical molecules - and all that with a couple of lines of Python code! All that's left is to choose a framework for trying out next-level deep learning models ... but which one? The "grandfather" of Python quantum computing packages, PennyLane, with tons of user-friendly tutorials - or maybe Qiskit, which runs natively on IBM quantum computers and, moreover, got its first major release in February 2024?
The answer is not that obvious, and together, we'll look at the pros and cons of both via training quantum circuits, assessing compatibility with popular Python machine learning packages - and all that on examples of real-world problems from financial and natural sciences.
XGBoost is considered a state-of-the-art model for regression, classification, and learning-to-rank problems on tabular data. Unfortunately, tree-based ensemble models are notoriously difficult to explain, limiting their application in critical fields. Techniques like SHapley Additive exPlanations (SHAP) and Explainable Boosting Machine (EBM) have become common methods for assessing how much each feature contributes to the model prediction.
This talk will introduce SHAP and EBM, explaining the theory behind their mechanisms in an accessible way and discussing the pros and cons of both techniques. We will also comment on Python snippets where SHAP and EBM are used to explain a gradient boosting model.
Attendees will walk away with an understanding of how SHAP and EBM work, the limitations and merits of both techniques, and a tutorial on how to use these methods in Python, courtesy of the shap and interpret-ml packages.
Talk outline:
- A brief reminder about gradient boosting and XGBoost (5 mins)
- The challenge of explainability (5 mins)
- EBM: theory and applications (10 mins)
- SHAP: theory and applications (10 mins)
Python developers and data enthusiasts attending the Swiss Python Summit, this talk is for you! Apache Spark is a powerful framework often used alongside Python for big data processing. You've seen its capabilities, but what powers its impressive performance?
Join me, Neil Gibbons, a Backend Engineer with a passion for distributed systems (and a recent MSc in Data Science!). I've also delivered talks at DevFest Mons 2022 and Birkbeck University.
In this session, we'll delve into the internal workings of Spark. We'll explore concepts like Resilient Distributed Datasets (RDDs), which are fundamental to Spark's fault tolerance. We'll see how Spark distributes tasks across a cluster, leveraging Python's strengths in parallel processing. Finally, we'll uncover the secrets of in-memory computations, the key to Spark's blazing speed.
Why attend? Gaining a deeper understanding of Spark's internals, especially within the Python ecosystem, empowers you to:
- Optimize your Python big data applications for peak performance.
- Troubleshoot issues more efficiently.
- Write effective Spark code that unlocks its true potential and complements your Python expertise.
Whether you're a data scientist, developer, or simply curious about big data, this talk will bridge the gap between Python and Spark. Join me as we explore Spark's inner workings!
Join us on two Vision-Language Adventures!
We'll uncover the information hidden inside big image collections with Vision-Language Models (VLMs) showing us the way. Who knows which forgotten gems await us?
In the first part, we'll use CLIP and FAISS to go on a treasure hunt in your photo collection. You'll learn how to filter through millions of images in a breeze, using natural language. Bye-bye endless scrolling, hour-long tagging, and frustrated folder searching 👋
In the second part, we will harness the power of VLMs to help us caption images - translating pixels to words. Then we'll apply simple NLP methods like Bag-of-Words and clustering to reveal deep insights into your photo collections.
By the end of this talk, you'll be equipped with the knowledge and tools to unlock new insights, identify patterns, and make your image data work harder for you.
This talk is for an intermediate audience - it is good if you bring some knowledge in Computer Vision, NLP or just general Deep Learning.
The talk will be structured as follows:
- 5min - What are VLMs?
- 10min - Image Search with CLIP and FAISS
- 10min - Analysis of Captioned Images
- 5min - Possible Applications, Closing Thoughts
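The caption-analysis step from part two can be sketched with scikit-learn on a handful of toy captions (the captions and cluster count are invented for this sketch; a real collection would use VLM-generated captions):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

captions = [
    "a dog playing on the beach",
    "a dog running on green grass",
    "sunset over the mountains",
    "snowy mountain peak at sunset",
]

# Bag-of-Words: one sparse count vector per caption
X = CountVectorizer().fit_transform(captions)

# cluster the captions into 2 groups to surface themes in the collection
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)
```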
In pursuit of classifying train stations from Open Railway Maps data for Europe's largest rail cargo company, we initially focused on developing a robust deep learning framework, which required extensive manual labeling of images to train the model effectively. Recognizing the impracticality and time-consuming nature of manual labeling, we conceptualized an approach to expedite the labeling process using cluster algorithms and graph information.
Our method involved an automated labeling algorithm, which significantly accelerated the annotation phase. This algorithm demonstrated remarkable efficiency, automatically labeling images with high accuracy, thereby drastically reducing the manual effort involved.
During the implementation, we discovered that our automated labeling algorithm was, in itself, the comprehensive solution for the classification task we aimed to address. This realization highlighted that our initial objective of deploying a deep learning model could be achieved through “classic” means.
In conclusion, our project unveiled that the automated labeling algorithm was not just a tool to facilitate deep learning, but an effective standalone solution in itself. This unexpected outcome emphasizes that sometimes, the journey towards deep learning can reveal simpler, yet equally powerful, solutions.
Damn - like all data scientists deep down, we wanted to take advantage of some sexy deep learning and ended up with a great, but not so sexy, core data science solution.
Retrieval Augmented Generation (RAG) has been used to mitigate hallucination issues from LLMs and to rapidly provide LLMs with external knowledge that was not part of the pre-training data. While tutorials offer convenient ways to build POCs quickly, transitioning these prototypes to production environments often catches us off-guard with unforeseen challenges. This talk takes a deeper dive into the topics that are often missing from cookbooks and tutorials yet are crucial in scaling your RAG prototype to production. Our discussion will use real examples to help you better understand some of the best practices in production RAG for observability, security, scalability, and fault tolerance.