Add "Advanced Python DS Ecosystem" course materials #3
|
@clstaudt we created some new content for a customer covering OOP, Poetry, databases, Polars, and dashboards. Would you be interested in looking over the material and giving some feedback before we merge? |
|
@ccauet Certainly. There might even be some thematic overlap with new material I am building. |
|
Polars material
Not quite in the familiar form of notebooks from Data Science Learning Paths yet:
Technical:
Nice to have: Comparison of the PySpark and Polars APIs, since they look very similar. |
|
Object Oriented Programming material
Really? A bit vague and misleading. Start with the idea of grouping data and logic together.
C++ and Java have access modifiers that enforce visibility rules. The Python conventions explained here are not usually called access modifiers. Suggestion:
Also, this is not strictly correct:
Instead of:
... consider:
Nice to have: A more elaborate example where an OOP design really makes code elegant and easy to manage. For example the state machine design pattern -> check out https://github.com/clstaudt/cpp-patterns/blob/main/State/music.py
Nice to have: A practical example of how OOP is used in a data science library. For example, scikit-learn Estimators and Transformers.
Exercise: Build your own Estimator... |
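As a rough illustration of what the "Build your own Estimator" exercise could ask for, here is a minimal sketch. `SimpleScaler` is a hypothetical name and not part of scikit-learn. It only mimics the scikit-learn conventions that `fit` returns `self` and that learned attributes end with an underscore, and it uses only the standard library so it runs without scikit-learn installed.

```python
from statistics import fmean, pstdev


class SimpleScaler:
    """Toy transformer (hypothetical) that groups learned state and logic.

    Mimics the scikit-learn fit/transform convention for a 1-D list of numbers.
    """

    def fit(self, values):
        values = list(values)
        if not values:
            raise ValueError("cannot fit on empty data")
        # Learned attributes carry a trailing underscore, as in scikit-learn.
        self.mean_ = fmean(values)
        std = pstdev(values)
        # Fall back to 1.0 for constant data to avoid division by zero.
        self.scale_ = std if std > 0 else 1.0
        return self  # returning self allows chaining: fit(...).transform(...)

    def transform(self, values):
        if not hasattr(self, "mean_"):
            raise RuntimeError("call fit() before transform()")
        return [(v - self.mean_) / self.scale_ for v in values]

    def fit_transform(self, values):
        values = list(values)  # materialize once so both steps see the data
        return self.fit(values).transform(values)


scaler = SimpleScaler()
scaled = scaler.fit_transform([1.0, 2.0, 3.0])
```

The object keeps the statistics learned in `fit` together with the logic that uses them in `transform`, which is the "grouping data and logic" idea in a data science setting.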
|
1. Development of Python Packages with Poetry
Material missing or not linked in the TOC? |
|
Working with Databases
ORM: SQLAlchemy
- "pyramid scheme" but "database schema"
NoSQL databases with PyMongo
Pandas + SQL(Alchemy)
This explains pandas + SQL. If we are already using SQLAlchemy to interact with the DB, should I write raw SQL queries to read data into pandas, or rather something like this?
import pandas as pd

# Session and User are assumed to be defined as in the SQLAlchemy material.
# Using the session in a with statement
with Session() as session:
    # Inserting data
    sample_users = [User(name="Alice", age=30), User(name="Bob", age=25), User(name="Charlie", age=35)]
    session.add_all(sample_users)
    session.commit()
    # Querying data
    users_query = session.query(User).all()
    # Convert the query result to a pandas DataFrame
    df = pd.DataFrame([(user.id, user.name, user.age) for user in users_query],
                      columns=["ID", "Name", "Age"]) |
|
Streamlit
It would be great to have a Streamlit example here, but this particular demo may be too German for this repo...
Nice to have: A demo that shows off many of the interactive features Streamlit offers. |
…ro notebook to fit the usual format.
…nchmark' notebook. Update index accordingly.