Using a Data Science Virtual Machine

If you have been provisioned a Data Science Virtual Machine (DSVM), it will arrive in your workspace with the most common data science tools already installed and ready to use. You can still install your own software, and everything else works in the same way as on a standard Virtual Machine.

You can read more about how to access your VM here and how to use your VM here.

Performance and compute power

The standard specification for a DSVM is 4 vCPUs, 16 GB of memory, and 28 GB of temporary storage. Should you require additional power for your project, a higher-spec machine can be provided: see this article for details. Some features of the DSVM only become useful with additional VM resources. For example, once a GPU is added, the preinstalled libraries and tools for machine learning tasks can take advantage of it.
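To confirm what your own DSVM has been given, a minimal Python sketch such as the one below may help. It reports the number of vCPUs visible to Python and, if the `nvidia-smi` tool is found on the PATH, the GPUs it lists. The function name `describe_machine` is hypothetical and used for illustration only. The sketch assumes `nvidia-smi` is only present when NVIDIA drivers are installed.

```python
import os
import shutil
import subprocess


def describe_machine():
    """Return a small dict describing the compute resources visible to Python."""
    info = {"vcpus": os.cpu_count(), "gpus": []}

    # Assumption: nvidia-smi is on the PATH only when NVIDIA drivers are installed.
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return info

    try:
        # "nvidia-smi -L" prints one line per GPU it can see.
        result = subprocess.run(
            [smi, "-L"], capture_output=True, text=True, check=False
        )
    except OSError:
        # The executable exists but could not be started; report no GPUs.
        return info

    if result.returncode == 0:
        info["gpus"] = [line for line in result.stdout.splitlines() if line.strip()]
    return info


if __name__ == "__main__":
    print(describe_machine())
```

On a standard DSVM you would expect the `vcpus` value to match the specification above and the `gpus` list to be empty. If the output differs from what you requested, your Workspace Administrator can confirm the machine size.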

If you run into any issues or have any questions about Data Science Virtual Machines, please contact your Workspace Administrator or the Aridhia Service Desk.

What’s included in your Data Science VM

The table at the bottom of this article sets out the most common data science tools that are available in a DSVM. You can find the full list of tools, as well as more information about each specific tool, on Microsoft's website.

Please note that Visual Studio 2019 Community, Microsoft Teams, and Office 365 are not included in a Windows DSVM due to licensing issues.

Preinstalled data science tools

The Data Science VM images come with the following common software and packages preinstalled:

| Tool | Windows | Linux |
|---|---|---|
| Jupyter Notebook Server with kernels for: R, Python, Julia, and PySpark | Yes | Yes |
| Anaconda Python | Yes | Yes |
| RStudio Desktop | Yes | Yes |
| Visual Studio Code | Yes | Yes |
| Julia (Julialang) | Yes | Yes |
| Node.js | Yes | Yes |
| JupyterLab | Yes | Yes |
| JupyterHub | No | Yes |
| CRAN-R | Yes | Yes |
| Notepad++ | Yes | No |
| Nano | Yes | No |
| PyCharm Community Edition | Yes | Yes |
| RStudio Server (disabled by default) | No | Yes |
| IntelliJ IDEA | No | Yes |
| Vim | No | Yes |
| Emacs | No | Yes |
| Git and Git Bash | Yes | Yes |
| OpenJDK 11 | Yes | Yes |
| .NET Framework | Yes | No |
| Azure SDK | Yes | Yes |
| Microsoft Edge browser | Yes | Yes |
| Power BI Desktop | Yes | No |
| SQL Server 2019 Developer Edition | Yes | Yes |
| SQuirreL SQL | No | Yes |
| SQL Server Management Studio | Yes | No |
| Azure Storage Explorer | Yes | Yes |
| Azure CLI | Yes | Yes |
| AzCopy | Yes | No |
| Blob FUSE driver | No | Yes |
| Azure Cosmos DB Data Migration Tool | Yes | No |
| Unix/Linux command-line tools | No | Yes |
| Apache Spark 3.1 (standalone) | Yes | Yes |
| CUDA, cuDNN, NVIDIA Driver | Yes | Yes |
| Horovod | No | Yes |
| NVIDIA System Management Interface (nvidia-smi) | Yes | Yes |
| PyTorch | Yes | Yes |
| TensorFlow | Yes | Yes |
| Integration with Azure Machine Learning (Python) | Yes | Yes |
| XGBoost | Yes | Yes |
| Vowpal Wabbit | Yes | Yes |
| Weka | No | Yes |
| LightGBM | No | Yes |
| H2O | No | Yes |
| CatBoost | No | Yes |
| Intel MKL | No | Yes |
| OpenCV | No | Yes |
| Dlib | No | Yes |
| Docker | Yes | Yes |
| NCCL | No | Yes |
| Rattle | No | Yes |
| ONNX Runtime | No | Yes |
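
If you would like to check which of the command-line tools listed above are available on your own DSVM, a short Python sketch along these lines may be useful. It looks each tool up on the PATH with `shutil.which`. The executable names in `TOOLS` (for example `az` for Azure CLI and `code` for Visual Studio Code) are assumptions and may differ on your machine, and the helper name `find_tools` is hypothetical. A tool that is installed but not on the PATH will be reported as not found.

```python
import shutil

# Assumption: these executable names are illustrative guesses and may
# differ between Windows and Linux images.
TOOLS = {
    "Git": "git",
    "Docker": "docker",
    "Azure CLI": "az",
    "Julia": "julia",
    "Node.js": "node",
    "Jupyter": "jupyter",
    "Visual Studio Code": "code",
    "nvidia-smi": "nvidia-smi",
}


def find_tools(tools):
    """Map each tool name to its resolved path, or None if not on the PATH."""
    return {name: shutil.which(command) for name, command in tools.items()}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

Desktop applications such as RStudio Desktop or Power BI Desktop are not covered by this check, as they are usually started from the desktop or Start menu instead of the command line.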
Updated on February 2, 2022
