📚 Data Science Riddle

You want to detect extreme values visually in one plot. Which one is best?
Anonymous Quiz
53%
Box plot
29%
Heatmap
10%
Line chart
7%
Area plot
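
A box plot marks extreme values as points beyond its whiskers. A minimal sketch of the common 1.5 × IQR rule behind those whiskers, assuming a made-up sample (the values and the helper name `iqr_outliers` are illustrative; plotting libraries may compute quartiles slightly differently):

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values outside the k * IQR fences (the usual box-plot rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: one value sits far away from the rest
sample = [12, 13, 13, 14, 15, 15, 16, 17, 95]
print(iqr_outliers(sample))
```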
Mining of Massive Datasets (Leskovec, Stanford).pdf
2.9 MB
The Big Data bible from Stanford: MapReduce, Spark, recommendation systems, PageRank, locality-sensitive hashing, large-scale machine learning, and mining social networks and streams, all explained clearly with real algorithms you can code today. 500 pages of pure gold.
If you want to become a Data Scientist, this is the path to follow.
📚 Data Science Riddle

You want to prevent inconsistent data across environments. What helps most?
Anonymous Quiz
31%
Checkpoints
18%
Contracts
39%
Indexes
12%
Sharding
🛠️ Running Code in Jupyter Notebooks

Jupyter Notebooks let you write & run code interactively.
Here’s a quick guide to make your workflow smoother:

▶️ Kernel & Code Cells
- Each notebook is tied to a single kernel (e.g. IPython).
- Code cells are where you write and execute code.

⌨️ Useful Shortcuts
- Shift + Enter → run current cell, move to next
- Alt + Enter → run current cell, insert new one below
- Ctrl + Enter → run current cell, stay in place

🔄 Kernel Management
- Interrupt the kernel if code hangs.
- Restart kernel to reset memory & variables.

🖥️ Output Handling
- Results & errors appear directly under the cell.
- Long-running code outputs appear as they’re generated.
- Large outputs can be scrolled or collapsed for clarity.

💡 Pro Tip:
Always “Restart & Run All” before sharing or saving a notebook.
This ensures reproducibility and clean results.

📚 Data Science Riddle

You need fast reads of small files. Which storage option fits best?
Anonymous Quiz
23%
Distributed FS
11%
Cold storage
21%
Object Storage
46%
Local SSD
6 Must-Know Data Engineering Tools For Beginners
📚 Data Science Riddle

A feature has low importance but domain experts insist it matters. What do you do?
Anonymous Quiz
27%
Encode it differently
21%
Scale it
11%
Drop the feature
41%
Check interaction effects
Advanced Data Science on Spark.pdf
1.8 MB
From Stanford University: covers Spark for ML, graph processing (GraphFrames), and integration with Hadoop.
📚 Data Science Riddle

Your estimate has high variance. Best fix?
Anonymous Quiz
56%
Increase sample size
28%
Change confidence level
9%
Reduce bin count
7%
Switch to bootstrap
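
Why a larger sample helps: assuming the estimate is a sample mean of independent observations, its standard error shrinks as 1/sqrt(n). A minimal sketch with a made-up standard deviation of 10 (the function name `standard_error` is illustrative):

```python
import math

def standard_error(sample_std, n):
    """Standard error of a sample mean: std / sqrt(n)."""
    return sample_std / math.sqrt(n)

# Quadrupling the sample size halves the standard error
for n in (100, 400, 1600):
    print(n, standard_error(10.0, n))
```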
The Difference Between Model Accuracy and Business Accuracy

A model can be 95% accurate…
yet deliver 0% business value.

Why❔
Because data science metrics ≠ business metrics.

📌 Examples:
- A fraud model catches small frauds but misses the large ones
- A churn model predicts only the already obvious churners
- A recommendation model boosts clicks but reduces revenue

Always align ML metrics with business KPIs.
Otherwise, your “great model” is just a great illusion.
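
To make the fraud example concrete, here is a small sketch on entirely made-up transactions (amounts, counts, and variable names are assumptions for illustration). It compares plain accuracy with the share of fraud value actually caught:

```python
# Hypothetical transactions: (amount, is_fraud, flagged_by_model)
transactions = (
    [(50, False, False)] * 95      # legitimate, correctly passed
    + [(20, True, True)] * 4       # small frauds, caught
    + [(10_000, True, False)]      # one large fraud, missed
)

# ML metric: share of transactions classified correctly
correct = sum(is_fraud == flagged for _, is_fraud, flagged in transactions)
accuracy = correct / len(transactions)

# Business metric: share of fraudulent money the model actually stopped
fraud_total = sum(amount for amount, is_fraud, _ in transactions if is_fraud)
fraud_caught = sum(
    amount for amount, is_fraud, flagged in transactions if is_fraud and flagged
)
value_recall = fraud_caught / fraud_total

print(f"accuracy: {accuracy:.2%}")
print(f"fraud value caught: {value_recall:.2%}")
```

The model looks excellent by accuracy while stopping under 1% of the fraudulent amount, which is the gap the post describes.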
📚 Data Science Riddle

Your model's loss fluctuates but doesn't decrease overall. What's the most likely issue?
Anonymous Quiz
29%
Gradient exploding
39%
Weak regularization
20%
Small batch size
11%
Slow optimizer
✅ Complete AI (Artificial Intelligence) Roadmap 🤖🚀

1️⃣ Basics of AI
🔹 What is AI?
🔹 Types: Narrow AI vs General AI
🔹 AI vs ML vs DL
🔹 Real-world applications

2️⃣ Python for AI
🔹 Python syntax & libraries
🔹 NumPy, Pandas for data handling
🔹 Matplotlib, Seaborn for visualization

3️⃣ Math Foundation
🔹 Linear Algebra: Vectors, Matrices
🔹 Probability & Statistics
🔹 Calculus basics
🔹 Optimization techniques

4️⃣ Machine Learning (ML)
🔹 Supervised vs Unsupervised
🔹 Regression, Classification, Clustering
🔹 Scikit-learn for ML
🔹 Model evaluation metrics

5️⃣ Deep Learning (DL)
🔹 Neural Networks basics
🔹 Activation functions, backpropagation
🔹 TensorFlow / PyTorch
🔹 CNNs, RNNs, LSTMs

6️⃣ NLP (Natural Language Processing)
🔹 Text cleaning & tokenization
🔹 Word embeddings (Word2Vec, GloVe)
🔹 Transformers & BERT
🔹 Chatbots & summarization

7️⃣ Computer Vision
🔹 Image processing basics
🔹 OpenCV for CV tasks
🔹 Object detection, image classification
🔹 CNN architectures (ResNet, YOLO)

8️⃣ Model Deployment
🔹 Streamlit / Flask APIs
🔹 Docker for containerization
🔹 Deploy on cloud: Render, Hugging Face, AWS

9️⃣ Tools & Ecosystem
🔹 Git & GitHub
🔹 Jupyter Notebooks
🔹 DVC, MLflow (for tracking models)

🔟 Build AI Projects
🔹 Chatbot, Face recognition
🔹 Spam classifier, Stock prediction
🔹 Language translator, Object detector
📚 Data Science Riddle - CNN Kernels

Which convolution increases channel depth but not spatial size?
Anonymous Quiz
6%
1x1 convolution
30%
3x3 convolution
47%
Depthwise convolution
17%
Transposed convolution
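
The usual textbook answer is the 1x1 convolution: it mixes channels at every pixel, so the channel count can change while height and width stay the same. A minimal pure-Python sketch, assuming a channels-first layout, no bias, and made-up weights (the helpers `conv1x1` and `shape` are illustrative, not a library API):

```python
def conv1x1(image, weights):
    """Apply a 1x1 convolution.

    image:   nested list shaped [C_in][H][W]
    weights: nested list shaped [C_out][C_in] (bias omitted for brevity)
    """
    c_in, height, width = len(image), len(image[0]), len(image[0][0])
    return [
        [
            [sum(w[c] * image[c][y][x] for c in range(c_in)) for x in range(width)]
            for y in range(height)
        ]
        for w in weights
    ]

def shape(tensor):
    """Return (channels, height, width) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))

# Hypothetical 3-channel 4x4 input, expanded to 8 channels
image = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
weights = [[0.1 * (o + 1)] * 3 for o in range(8)]
out = conv1x1(image, weights)
print(shape(image), "->", shape(out))
```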
Normalization vs Standardization: Why They’re Not the Same

People treat these two as interchangeable. They’re not.

👉 Normalization (Min-Max scaling):
Rescales values to the range 0–1.
Useful when you need bounded inputs (pixel values, distance-based methods).

👉 Standardization (Z-score):
Rescales data to mean = 0 and std = 1.
Useful when features must be on comparable scales (regularized linear/logistic regression, PCA).

🔑 Key idea:
Normalization fixes the range, so a single outlier can squeeze every other value together.
Standardization fixes the mean and spread, leaving values unbounded.

Pick the wrong one, and your model’s geometry becomes distorted.
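
A minimal sketch of both transforms using only the standard library, on made-up data (function names are illustrative; in practice you would likely reach for a library scaler, and both helpers assume the values are not all identical):

```python
import statistics

def min_max_scale(values):
    """Normalization: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Standardization: subtract the mean, divide by the (population) std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

data = [10, 20, 30, 40, 50]  # hypothetical feature values
normalized = min_max_scale(data)
z_scores = standardize(data)

print(normalized)
print([round(z, 3) for z in z_scores])
```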
Hey everyone 👋

Tomorrow we are kicking off a new short & free series called:

📊 Data Importing Series 📊

We’ll go through all the real ways to pull data into Python:
→ CSV, Excel, JSON and more
→ Databases & SQL databases
→ APIs, Google Sheets, even PDFs & web scraping

Short lessons, ready-to-copy code, zero boring theory.

First part drops tomorrow.
Turn on notifications so you don’t miss it 🔔

Who’s excited? React with a 🔥 if you are.
Loading a CSV file in Python

CSV stands for Comma-Separated Values, the most common format for tabular data.
With pandas, turning a CSV into a queryable DataFrame takes just a few lines.

# Import the pandas library
import pandas as pd

# Specify the path to your CSV file
filename = "data.csv"

# Read the CSV file into a DataFrame
df = pd.read_csv(filename)

# Check the first five rows
df.head()


Next up ➡️ Loading an Excel file in Python

👉 Join @datascience_bds for more
Part of the @bigdataspecialist family
Loading an Excel file in Python

Excel files are often packed with extra header rows, logos, merged cells, and multiple sheets, but pandas can handle them.
With just a few extra parameters, you can pick a sheet, skip junk rows, limit the rows loaded, and more.

# Import the pandas library 
import pandas as pd

# Specify the path to your Excel file (.xlsx or .xls)
filename = "data.xlsx"

# Read the Excel file into a DataFrame
# Common options you'll use all the time:
df = pd.read_excel(
    filename,
    sheet_name=0,              # 0 = first sheet
    header=0,                  # Row (0-indexed, counted after skiprows) to use as column names
    skiprows=4,                # Skip the first 4 rows
    nrows=1000,                # Load only the first 1000 data rows
)

# Check the first five rows
df.head()


Next up ➡️ Loading a text file in Python

📚 Data Science Riddle - Numerical Optimization

Which method uses second-order curvature information?
Anonymous Quiz
36%
SGD
20%
Momentum
34%
Adam
11%
Newton's method
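
Of the four options, Newton's method is the one that uses the second derivative (the Hessian in higher dimensions) to scale each step. A minimal 1-D sketch, assuming a made-up objective f(x) = x**4 - 3*x**2 + 2 and a starting point where the second derivative is positive (all names here are illustrative):

```python
def newton_minimize(grad, hess, x0, steps=20, tol=1e-10):
    """Minimize a 1-D function with Newton's method: x <- x - f'(x) / f''(x)."""
    x = x0
    for _ in range(steps):
        step = grad(x) / hess(x)   # curvature (second derivative) scales the step
        x -= step
        if abs(step) < tol:
            break
    return x

def grad(x):
    """First derivative of the hypothetical f(x) = x**4 - 3*x**2 + 2."""
    return 4 * x**3 - 6 * x

def hess(x):
    """Second derivative of the same f."""
    return 12 * x**2 - 6

x_min = newton_minimize(grad, hess, x0=2.0)
print(x_min)  # converges to the local minimum at sqrt(1.5)
```

SGD, Momentum, and Adam rely on first-order gradients only; Adam rescales steps using running averages of squared gradients, which is not true curvature information.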
Loading a text file in Python

Text files (.txt) are perfect for logs, books, raw notes, or any unstructured data.
With Python's built-in open(), a few lines load an entire novel, log file, or dataset into a string.

# Loading a text file in Python

filename = 'huck_finn.txt'                  # Name of the file to open

file = open(filename, mode='r')             # Open file in read mode ('r')
                                            # Use encoding='utf-8' if needed

text = file.read()                          # Read entire content into a string

print(file.closed)                          # False → file is still open

file.close()                                # Always close the file when done!
                                            # Frees the file handle & avoids file locks

print(file.closed)                          # Now True → file is safely closed

print(text)                                 # Display the full text content
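
The same read can be written so the file is closed for you. A minimal sketch, assuming a throwaway file named `sample.txt` created just for the demo (a hypothetical name and content, not from the post), showing a `with` block and the one-call `pathlib` version:

```python
from pathlib import Path
import tempfile

# Create a small throwaway file so the example runs anywhere (hypothetical content)
tmp_dir = Path(tempfile.mkdtemp())
path = tmp_dir / "sample.txt"
path.write_text("line one\nline two\n", encoding="utf-8")

# Option 1: a with-block closes the file automatically, even if an error occurs
with open(path, mode="r", encoding="utf-8") as file:
    text = file.read()
print(file.closed)  # the file is closed once the block exits

# Option 2: pathlib opens, reads, and closes the file in a single call
text_again = path.read_text(encoding="utf-8")
print(text_again)
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding.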


Next up ➡️ Loading a JSON file in Python

2025/12/16 06:14:06