Using a Data Science Virtual Machine

If you have been provisioned a Data Science Virtual Machine (DSVM) as a Virtual Desktop, it arrives in your workspace with the most common data science tools already installed and ready to go. You can still install your own software, and everything else works in the same way as on a standard Virtual Desktop.

You can read more about how to access your Virtual Desktop here and how to use your Virtual Desktop here.

Performance and compute power

The standard specification for a DSVM is 4 vCPUs, 16 GB of memory, and 28 GB of temporary storage. Should you require additional power for your project, a higher-specification machine can be provided: see this article for details. Some features of the DSVM may require additional computing power. For example, when a GPU is added, the machine can use the preinstalled libraries and tools to run machine learning tasks on it.
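
If you are unsure which specification your own machine has, a quick check from Python can help before you request an upgrade. The following is a minimal sketch using only the Python standard library; the function name `describe_machine` is hypothetical, and it assumes that `nvidia-smi` is on the PATH when a GPU and driver are present. It does not report memory, because the standard library has no portable call for that.

```python
import os
import shutil
import subprocess


def describe_machine(path=None):
    """Return the logical CPU count, free disk space, and any GPUs found."""
    # Logical CPUs visible to the operating system (vCPUs on a virtual machine).
    # os.cpu_count() can return None if the count cannot be determined.
    vcpus = os.cpu_count()

    # Free space on the drive that holds `path` (current directory by default).
    free_gb = shutil.disk_usage(path or os.getcwd()).free / 1024**3

    # Assumption: nvidia-smi is only found on machines with an NVIDIA driver.
    gpus = None
    nvidia_smi = shutil.which("nvidia-smi")
    if nvidia_smi:
        try:
            # "nvidia-smi -L" prints one line per GPU.
            result = subprocess.run(
                [nvidia_smi, "-L"], capture_output=True, text=True, timeout=30
            )
            if result.returncode == 0:
                gpus = result.stdout.strip() or None
        except (OSError, subprocess.SubprocessError):
            gpus = None

    return {"vcpus": vcpus, "free_disk_gb": round(free_gb, 1), "gpus": gpus}


if __name__ == "__main__":
    for key, value in describe_machine().items():
        print(f"{key}: {value}")
```

A `gpus` value of `None` only means that no GPU was detected this way; it does not tell you whether one can be added to your machine.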

If you run into any issues or have any questions about Data Science Virtual Machines, please contact your Workspace Administrator or the Aridhia Service Desk.

What’s included in your Data Science Virtual Desktop

The table at the bottom of this article sets out the most common data science tools that are available in a DSVM. You can find the full list of tools, as well as more information about each specific tool, on Microsoft's website.

Please note that Visual Studio 2019 Community, Microsoft Teams, and Microsoft 365 are not included in a Windows DSVM due to licensing issues.
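
To confirm which of the listed tools are actually available on your own machine, you can probe for them from Python. This is a rough sketch, not an official check: the function names are hypothetical, and the command names (`git`, `julia`, `node`, `az`, `docker`, `code`) and package names (`torch`, `tensorflow`, `xgboost`) are assumptions about how those tools are exposed. A tool that is installed but not on the PATH, or a package installed in a different Python environment, will be reported as missing.

```python
import importlib.util
import shutil

# Assumed command-line names for a few of the tools in the table.
COMMANDS = ["git", "julia", "node", "az", "docker", "code"]

# Assumed import names for a few of the Python packages in the table.
PACKAGES = ["torch", "tensorflow", "xgboost"]


def find_commands(names):
    """Map each command name to its full path, or None if not on the PATH."""
    return {name: shutil.which(name) for name in names}


def find_packages(names):
    """Map each package name to True if the current Python can locate it."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for unusual module states; treat as missing.
            found[name] = False
    return found


if __name__ == "__main__":
    for name, path in find_commands(COMMANDS).items():
        print(f"{name:12} {path or 'not found on PATH'}")
    for name, present in find_packages(PACKAGES).items():
        print(f"{name:12} {'importable' if present else 'not found'}")
```

Run the script with the Python interpreter you intend to work in, since each environment has its own set of installed packages.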

Preinstalled data science tools

The Data Science Virtual Desktop images come with the following common software and packages preinstalled:

| Tool | Windows | Linux |
|---|---|---|
| Jupyter Notebook Server with kernels for: R, Python, Julia, and PySpark | Yes | Yes |
| Anaconda Python | Yes | Yes |
| Visual Studio Code | Yes | Yes |
| Julia (Julialang) | Yes | Yes |
| Node.js | Yes | Yes |
| JupyterLab | Yes | Yes |
| JupyterHub | No | Yes |
| CRAN-R | Yes | Yes |
| Notepad++ | Yes | No |
| Nano | Yes | No |
| PyCharm Community Edition | Yes | Yes |
| IntelliJ IDEA | No | Yes |
| Vim | No | Yes |
| Emacs | No | Yes |
| Git and Git Bash | Yes | Yes |
| OpenJDK 11 | Yes | Yes |
| .NET Framework | Yes | No |
| Azure SDK | Yes | Yes |
| Microsoft Edge browser | Yes | Yes |
| Power BI Desktop | Yes | No |
| SQL Server 2019 Developer Edition | Yes | Yes |
| SQuirreL SQL | No | Yes |
| SQL Server Management Studio | Yes | No |
| Azure Storage Explorer | Yes | Yes |
| Azure CLI | Yes | Yes |
| AzCopy | Yes | No |
| Azure Cosmos DB Data Migration Tool | Yes | No |
| Unix/Linux command-line tools | No | Yes |
| Apache Spark 3.1 (standalone) | Yes | Yes |
| CUDA, cuDNN, NVIDIA Driver | Yes | Yes |
| Horovod | No | Yes |
| NVIDIA System Management Interface (nvidia-smi) | Yes | Yes |
| PyTorch | Yes | Yes |
| TensorFlow | Yes | Yes |
| Integration with Azure Machine Learning (Python) | Yes | Yes |
| XGBoost | Yes | Yes |
| Vowpal Wabbit | Yes | Yes |
| LightGBM | No | Yes |
| H2O | No | Yes |
| CatBoost | No | Yes |
| Intel MKL | No | Yes |
| OpenCV | No | Yes |
| Dlib | No | Yes |
| Docker | Yes | Yes |
| NCCL | No | Yes |
| ONNX Runtime | No | Yes |

Updated on April 14, 2023
