
Setup

Setting up our environment for local development.

In this lesson, we'll set up the development environment that we'll be using in all of our lessons. The instructions here are for a local laptop, and everything in the course will work locally on your laptop.



Cluster 

We'll start by defining our cluster, which refers to a group of servers that come together to form one system. Each cluster has a head node that manages it and a set of worker nodes that execute our workloads. Clusters can be fixed in size or autoscale based on our application's compute needs, which lets them grow and shrink with the workload. We'll create our cluster by defining a compute configuration and an environment.

Environment 

We'll start by defining our cluster environment which will specify the software dependencies that we'll need for our workloads. 

💻 Local 

Your personal laptop will need to have Python installed and we highly recommend using Python 3.10. You can use a tool like pyenv (mac) or pyenv-win (windows) to easily download and switch between Python versions. 

pyenv install 3.10.11 # install 
pyenv global 3.10.11 # set default 
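Once the version is set, it can be worth confirming which interpreter is actually being picked up. The snippet below is a minimal sketch of such a check. The helper name `matches_recommended` is hypothetical (it is not part of the course repository), and it only compares the major and minor version against the 3.10 recommended above.

```python
import sys

# The major.minor version recommended in this lesson.
RECOMMENDED = (3, 10)


def matches_recommended(version_info=sys.version_info, recommended=RECOMMENDED):
    """Return True if the major.minor of `version_info` equals `recommended`.

    Hypothetical helper for illustration only.
    """
    return tuple(version_info[:2]) == tuple(recommended)


# Report the interpreter that is currently running this code.
print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
if not matches_recommended():
    print("Note: this lesson recommends Python 3.10.")
```

If the printed version is not the one you installed with pyenv, your shell is probably resolving a different `python3` first.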
 

Once we have our Python version, we can create a virtual environment to install our dependencies. We'll download our Python dependencies after we clone our repository from git shortly. 

mkdir madewithml 
cd madewithml 
python3 -m venv venv # create virtual environment 
source venv/bin/activate # on Windows: venv\Scripts\activate 
python3 -m pip install --upgrade pip setuptools wheel 

 

Compute 

Next, we'll define our compute configuration, which will specify our hardware dependencies (head and worker nodes) that we'll need for our workloads. 

💻 Local 

Your personal laptop (single machine) will act as the cluster, where one CPU will be the head node and some of the remaining CPUs will be the worker nodes (no GPUs required). All of the code in this course will work on any personal laptop, though it will be slower than executing the same workloads on a larger cluster.
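To get a rough sense of how many CPUs your laptop can contribute, you can ask Python directly. This is only a sketch using the standard library. `local_cpu_count` is a hypothetical helper name, and the number it returns is the count of logical CPUs, which may differ from the number of physical cores.

```python
import os


def local_cpu_count():
    """Return the number of logical CPUs visible to Python.

    os.cpu_count() can return None when the count cannot be determined,
    so we fall back to 1 in that case.
    """
    return os.cpu_count() or 1


print(f"Logical CPUs on this machine: {local_cpu_count()}")
```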

 Workspaces 

With our compute and environment defined, we're ready to create our cluster workspace. This is where we'll be developing our ML application on top of our compute, environment and storage. 

💻 Local 


Your personal laptop will need to have an integrated development environment (IDE) installed, such as VS Code. For bash commands in this course, you're welcome to use the terminal in VS Code or a separate one.



Git 

With our development workspace all set up, we're ready to start developing. We'll start by following these instructions to create a repository: 

  • Create a new repository
  • Name it Made-With-ML
  • Toggle Add a README file (very important as this creates a main branch)
  • Scroll down and click Create repository

Now we're ready to clone the Made With ML repository's contents from GitHub inside our madewithml directory.

export GITHUB_USERNAME="YOUR_GITHUB_USERNAME" # <-- CHANGE THIS to your username
git clone https://github.com/GokuMohandas/Made-With-ML.git . 
git remote set-url origin https://github.com/$GITHUB_USERNAME/Made-With-ML.git 
git checkout -b dev 
export PYTHONPATH=$PYTHONPATH:$PWD # so we can import modules from our scripts 
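
The last command appends the current directory to PYTHONPATH, which Python reads at startup to extend its module search path. If imports from the repository fail later, a quick check like the sketch below can confirm whether the directory is actually listed. The function name `on_python_path` is hypothetical, and the check only inspects the environment variable, not the full `sys.path`.

```python
import os


def on_python_path(directory, environ=os.environ):
    """Return True if `directory` is listed in the PYTHONPATH variable.

    Hypothetical helper for illustration. Empty entries (which appear
    when PYTHONPATH was unset before the export) are ignored.
    """
    entries = environ.get("PYTHONPATH", "").split(os.pathsep)
    target = os.path.abspath(directory)
    return any(os.path.abspath(entry) == target for entry in entries if entry)


# Check the directory we are currently in (the repository root, if you
# run this from inside the madewithml directory).
print("Current directory on PYTHONPATH:", on_python_path(os.getcwd()))
```

Note that `export` only affects the current terminal session, so the variable needs to be set again in each new terminal.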
 

💻 Local 

Recall that we created our virtual environment earlier but didn't actually install any Python dependencies yet. Now that we've cloned our repository, we can install the packages using the requirements.txt file.

python3 -m pip install -r requirements.txt 
 

Caution: make sure that we're installing our Python packages inside our virtual environment. 
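
One way to verify this from Python itself is to compare `sys.prefix` with `sys.base_prefix`: inside an environment created with `python3 -m venv`, the two differ. The sketch below wraps that comparison in a hypothetical helper, `in_virtualenv`. It is a sanity check under that assumption, not something the course repository provides.

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True if the interpreter appears to run inside a venv.

    In a venv, sys.prefix points at the environment directory while
    sys.base_prefix points at the Python installation it was created from.
    """
    return prefix != base_prefix


if in_virtualenv():
    print(f"Virtual environment active: {sys.prefix}")
else:
    print("No virtual environment detected; activate venv before installing.")
```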




Notebook 

Now we're ready to launch our Jupyter notebook to interactively develop our ML application. 

💻 Local 

We already installed jupyter through our requirements.txt file in the previous step, so we can just launch it. 

jupyter lab notebooks/madewithml.ipynb 


Ray 

We'll be using Ray to scale and productionize our ML application. Ray consists of a core distributed runtime along with libraries for scaling ML workloads and has companies like OpenAI, Spotify, Netflix, Instacart, Doordash + many more using it to develop their ML applications. We're going to start by initializing Ray inside our notebooks: 

import ray

# Initialize Ray (shut down any existing instance first)
if ray.is_initialized():
    ray.shutdown()
ray.init()
 

We can also inspect our cluster to see the available compute resources:

ray.cluster_resources()
 

💻 Local 

If you are running this on a local laptop (no GPU), use the CPU count from ray.cluster_resources() to set your resources. For example, if your machine has 10 CPUs, the output will look something like this:

{'CPU': 10.0, 
'object_store_memory': 2147483648.0, 
'node:127.0.0.1': 1.0} 
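
As a rough illustration of turning that output into a worker count, the sketch below takes a resources dictionary shaped like the one above and leaves a couple of CPUs free. Both the helper name `suggest_num_workers` and the choice to reserve 2 CPUs (for example, one for the head node and one for background tasks) are assumptions made for this example, not values prescribed by Ray. The function works on a plain dictionary, so it runs without Ray installed.

```python
def suggest_num_workers(resources, reserved_cpus=2):
    """Suggest a worker count from a cluster resources dictionary.

    `resources` is a dict like the one ray.cluster_resources() returns.
    `reserved_cpus` is an assumed number of CPUs to leave free; adjust
    it for your machine. At least 1 worker is always suggested.
    """
    total_cpus = int(resources.get("CPU", 0))
    return max(1, total_cpus - reserved_cpus)


# The example output shown above for a machine with 10 CPUs.
example_resources = {
    "CPU": 10.0,
    "object_store_memory": 2147483648.0,
    "node:127.0.0.1": 1.0,
}
print(suggest_num_workers(example_resources))  # prints 8
```

In a notebook where Ray is already initialized, the same function could be called on the live result of `ray.cluster_resources()` instead of the hard-coded dictionary.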



Head on over to the next lesson, where we'll motivate the specific application that we're trying to build from a product and systems design perspective. And after that, we're ready to start developing! 

