Difference between revisions of "Jupyter"
Revision as of 08:27, 9 January 2026
JupyterLab (Web) Tutorial - KENET HPC Cluster
Overview
JupyterLab is an interactive web-based environment for notebooks, code, and data, ideal for data science, scientific computing, and machine learning workflows.
Use Cases:
- Interactive data analysis and visualization
- Machine learning model development and experimentation
- Creating reproducible research notebooks
- Teaching and sharing computational narratives
- Real-time data exploration with GPU acceleration
Access: Available through the KENET Open OnDemand web portal at https://ondemand.vlab.ac.ke
Prerequisites
Before using JupyterLab, ensure you have:
- Active KENET HPC cluster account
- Access to Open OnDemand portal
- Basic knowledge of Python, R, or Julia
- Data files stored in /home/username/localscratch
Launching JupyterLab
Step 1: Access Interactive Apps
- Log into Open OnDemand: https://ondemand.vlab.ac.ke
- Click Interactive Apps in the top navigation menu
- Select JupyterLab from the dropdown list
Step 2: Configure Job Parameters
Fill in the job submission form with your requirements:
| Parameter | Description | Recommended Value |
|---|---|---|
| Partition | Queue for job execution | normal (CPU) or gpu (GPU tasks) |
| Walltime | Maximum runtime in hours | 2 hours for testing, up to 192 hours for long jobs |
| CPU Cores | Number of processor cores | 4-8 cores (adjust based on workload) |
| Memory | RAM allocation | 16 GB for data science, 32 GB for large datasets |
| Working Directory | Starting directory | /home/username or your project folder |
Step 3: Submit and Wait
- Click the Launch button
- Wait for the job to start (status changes from "Queued" to "Running")
- Click the Connect to JupyterLab button when it appears (typically within 30-60 seconds)
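Once the session is running, a notebook cell can confirm that the resources requested in the form were actually granted. The following is a minimal sketch, not an official KENET check: the helper name `session_resources` is hypothetical, and it assumes the scheduler restricts the job to its allocated cores through CPU affinity, which may not hold on every cluster configuration.

```python
import os
from pathlib import Path


def session_resources():
    """Return the CPU cores and directories visible to this Python process."""
    # os.sched_getaffinity exists on Linux and lists the cores this process
    # may run on; fall back to the machine-wide count on other platforms.
    if hasattr(os, "sched_getaffinity"):
        cpu_cores = len(os.sched_getaffinity(0))
    else:
        cpu_cores = os.cpu_count() or 1
    return {
        "cpu_cores": cpu_cores,
        "working_directory": os.getcwd(),
        "home_directory": str(Path.home()),
    }


for key, value in session_resources().items():
    print(f"{key}: {value}")
```

If the reported core count is much larger than the value entered in the form, the number probably reflects the whole node rather than your allocation.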
Quick Start Guide
Creating Your First Notebook
- Click File → New → Notebook or click the Python 3 tile in the Launcher
- Select kernel: Python 3, R, or Julia (if available)
- Start writing code in cells
Basic Cell Operations
Notebooks consist of cells where you can write and execute code. Here is a simple example to get you started:
# Code cell - Press Shift+Enter to run
import pandas as pd
import matplotlib.pyplot as plt
# Create sample data
data = pd.DataFrame({
    'x': range(10),
    'y': [i**2 for i in range(10)]
})
# Plot
plt.plot(data['x'], data['y'])
plt.title('Sample Plot on KENET HPC')
plt.show()
There are three main cell types in JupyterLab. Code cells contain executable code and are the default type. Markdown cells contain formatted text, equations, and documentation. Raw cells contain plain text that is not executed or formatted.
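To make the three cell types concrete, the sketch below builds a small notebook as the JSON structure that .ipynb files use (nbformat version 4). It is an illustration only: the helper `make_cell` and the cell contents are hypothetical, and the fields shown are the minimal ones I would expect a version 4.4 notebook to need.

```python
import json


def make_cell(cell_type, source):
    """Build one cell in the nbformat 4 layout used by .ipynb files."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells additionally record their outputs and execution count.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        make_cell("markdown", "# Sample analysis\nNotes and equations go here."),
        make_cell("code", "print('Hello from KENET HPC')"),
        make_cell("raw", "Plain text, neither executed nor rendered."),
    ],
}

# Serialize to the text that would be stored in a .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
print([cell["cell_type"] for cell in notebook["cells"]])
```

Only the code cell carries `outputs` and `execution_count`; markdown and raw cells store just their text, which is why running them never produces results.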
Installing Python Packages
You can install additional Python packages directly from within a notebook cell:
# In a notebook cell
!pip install --user seaborn scikit-learn
# or alternatively
%pip install --user packagename
Important: Always use the --user flag to install packages in your home directory, not system-wide. System directories are read-only and attempts to install there will fail.
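After a --user install, it can help to confirm that the running kernel actually sees the new packages. The sketch below uses only the Python standard library; the helper name `is_installed` is hypothetical, and the package names are taken from the install example (scikit-learn is imported as `sklearn`).

```python
import importlib.util
import site
import sys


def is_installed(module_name):
    """Return True if the running kernel can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Directory that pip targets when called with --user.
user_site = site.getusersitepackages()
print("pip --user target:", user_site)
print("On sys.path now:", user_site in sys.path)

for name in ("seaborn", "sklearn"):
    print(name, "installed" if is_installed(name) else "missing")
```

If the user directory is not yet on `sys.path` (for example because it did not exist when the kernel started), restarting the kernel from the Kernel menu is usually enough for the packages to become importable.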
Common Tasks
Task 1: Loading Data from Cluster Storage
Loading data from files stored on the cluster is straightforward using pandas or other data libraries. You can read data from your home directory or from the faster scratch storage:
