This guide will walk you through setting up a complete data science environment with Anaconda, Jupyter Notebooks, Python, IntelliJ IDEA, R, and RStudio. By the end, you’ll have a powerful and flexible workspace for all your data science projects.
1. Installing Anaconda#
Anaconda is a distribution of Python and R for scientific computing that simplifies package management and deployment.
Download and Install Anaconda#
Visit the Anaconda download page and get the appropriate version for your platform. This guide focuses on Linux, but the process is similar for Windows and macOS.
# For Linux, after downloading the .sh file, make it executable:
chmod +x Anaconda3-2023.X-Linux-x86_64.sh
# Run the installer
./Anaconda3-2023.X-Linux-x86_64.sh
# Follow the prompts to complete installation
# Remember to say "yes" when asked if you want to initialize Anaconda3
After installation, restart your terminal or run:
source ~/.bashrc
2. Setting Up Jupyter Notebooks with Enhancements#
Install Conda Extensions for Jupyter#
These extensions allow you to manage conda environments directly from Jupyter:
# Install nb_conda for environment management in Jupyter
conda install -c anaconda nb_conda
# If you need development features
conda install -c anaconda-nb-extensions/label/dev nb_conda
Install Jupyter Extensions#
Jupyter extensions add useful features like code formatting, table of contents, and more:
# Install contrib extensions (collection of community-contributed extensions)
conda install -c conda-forge jupyter_contrib_nbextensions
# Install configurator to enable/disable extensions via web interface
conda install -c conda-forge jupyter_nbextensions_configurator
Customize Jupyter Appearance#
You can customize the look and feel of Jupyter notebooks:
# Install Jupyter themes
conda install -c conda-forge jupyterthemes
# Set a dark theme (chesterish in this example)
jt -t chesterish
# Reset to default theme (if needed)
jt -r
3. Setting Up IntelliJ IDEA for Python Development#
IntelliJ IDEA with the Python plugin provides a powerful IDE for Python development.
Download and Install IntelliJ IDEA#
- Download IntelliJ IDEA from the official website
- Extract the downloaded archive:
tar -xzf ideaIC-*.tar.gz
- Navigate to the bin directory and run the setup script:
cd idea-IC-*/bin ./idea.sh
- Follow the installation prompts
- Make sure to select “Create Desktop Entry” for easy access
Configure Python Support in IntelliJ IDEA#
- Launch IntelliJ IDEA
- Install the Python Community Edition Plugin:
- Go to File → Settings → Plugins
- Search for “Python Community Edition”
- Click Install and restart the IDE when prompted
- Configure your Python/Conda environment:
- Go to File → Project Structure → SDKs
- Click the “+” button and select “Python SDK”
- Choose “Conda Environment” → “Existing environment”
- Navigate to your Anaconda installation (typically in ~/anaconda3)
- Or create a new virtual environment specific to your project
4. Setting Up R and RStudio Support#
Install R Kernel for Jupyter#
To use R within Jupyter notebooks:
# Install R and the IRKernel package
conda install -c r r-irkernel
Install RStudio (Optional)#
If you prefer a dedicated R environment alongside Jupyter:
# For Ubuntu/Debian-based distributions:
wget https://download1.rstudio.org/desktop/bionic/amd64/rstudio-2023.X.X-XXX-amd64.deb
sudo dpkg -i rstudio-2023.X.X-XXX-amd64.deb
sudo apt-get install -f # To resolve dependencies if needed
5. Launching Your Data Science Environment#
Starting Jupyter Notebook#
# Start Jupyter Notebook server
jupyter notebook
Your browser will open with the Jupyter interface. You can now create new notebooks with either Python or R kernels.
Using Jupyter Extensions#
- In the Jupyter interface, navigate to the “Nbextensions” tab
- Enable the extensions you want to use
- Return to the “Files” tab to create or open notebooks
Working with Projects in IntelliJ IDEA#
- Launch IntelliJ IDEA
- Select “New Project” or “Open”
- Choose “Python” as the project type
- Select your configured Python/Conda interpreter
- Start developing your Python code with full IDE support
Troubleshooting#
Common Issues#
- “Command not found” after installing Anaconda: Make sure to restart your terminal or source your
.bashrc
file. - Missing packages in Jupyter: Ensure you’ve activated the correct environment.
- IntelliJ not recognizing Python: Verify your Python SDK configuration in Project Structure.
Environment Management#
Keep track of your environments and installed packages:
# List all conda environments
conda env list
# List packages in current environment
conda list
# Export environment for sharing/backup
conda env export > environment.yml
# Create environment from exported file
conda env create -f environment.yml
This setup provides a complete data science environment with both GUI tools and command-line capabilities. You now have the flexibility to work with Python and R in both notebook and IDE formats, giving you the best of all worlds for data science projects.