Using a Data Science Virtual Machine

If you have been provisioned a Data Science Virtual Machine (DSVM), it will arrive in your workspace with the most common data science tools already installed and ready to use. You can still install your own software, and everything else works in the same way as on a standard Virtual Machine.

You can read more about how to access your Virtual Machine here and how to use your Virtual Machine here.

Performance and compute power

The standard specification for a DSVM is 4 vCPUs, 16 GB of memory, and 28 GB of temporary storage. Should you require additional power for your project, a higher-spec machine can be provided: see this article for details. Some features of the DSVM may require additional computing power. For example, when a GPU is added, the preinstalled libraries and tools for running machine learning tasks can take advantage of it.
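
To see what your own machine actually provides, a short Python script can compare the visible vCPU count against the standard figure quoted above and check whether a GPU responds. This is a minimal sketch rather than an official tool: the helper names `gpu_visible` and `describe_machine` are made up for illustration, and it assumes that `nvidia-smi -L` (which lists GPUs) exits successfully only when a working GPU and driver are present.

```python
import os
import shutil
import subprocess

# Baseline taken from this article: a standard DSVM has 4 vCPUs.
STANDARD_VCPUS = 4


def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU.

    Assumption for this sketch: `nvidia-smi -L` prints lines starting
    with "GPU" and exits with code 0 only when a GPU is usable.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


def describe_machine(vcpus, has_gpu):
    """Summarise the detected resources against the standard specification."""
    if vcpus is None:
        cpu_note = "vCPU count could not be determined"
    elif vcpus > STANDARD_VCPUS:
        cpu_note = f"{vcpus} vCPUs (above the standard {STANDARD_VCPUS})"
    elif vcpus == STANDARD_VCPUS:
        cpu_note = f"{vcpus} vCPUs (the standard specification)"
    else:
        cpu_note = f"{vcpus} vCPUs (below the standard {STANDARD_VCPUS})"
    gpu_note = "GPU detected" if has_gpu else "no GPU detected"
    return f"{cpu_note}; {gpu_note}"


if __name__ == "__main__":
    # os.cpu_count() reports the logical CPUs visible to the OS, or None.
    print(describe_machine(os.cpu_count(), gpu_visible()))
```

Because `nvidia-smi` is preinstalled on the DSVM whether or not a GPU is attached, the script runs the tool instead of only checking that it exists.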

If you run into any issues or have any questions about Data Science Virtual Machines, please contact your Workspace Administrator or the Aridhia Service Desk.

What’s included in your Data Science Virtual Machine

The table at the bottom of this article sets out the most common data science tools available in a DSVM. The full list of tools, along with more information about each one, is available on Microsoft's website.

Please note that Visual Studio 2019 Community, Microsoft Teams, and Microsoft 365 are not included in a Windows DSVM due to licensing issues.

Preinstalled data science tools

The Data Science Virtual Machine images come with common software and packages preinstalled, including the following:

| Tool | Windows | Linux |
| --- | --- | --- |
| Jupyter Notebook Server with kernels for: R, Python, Julia, and PySpark | Yes | Yes |
| Anaconda Python | Yes | Yes |
| Visual Studio Code | Yes | Yes |
| Julia (Julialang) | Yes | Yes |
| Node.js | Yes | Yes |
| JupyterLab | Yes | Yes |
| JupyterHub | No | Yes |
| CRAN-R | Yes | Yes |
| Notepad++ | Yes | No |
| Nano | Yes | No |
| PyCharm Community Edition | Yes | Yes |
| IntelliJ IDEA | No | Yes |
| Vim | No | Yes |
| Emacs | No | Yes |
| Git and Git Bash | Yes | Yes |
| OpenJDK 11 | Yes | Yes |
| .NET Framework | Yes | No |
| Azure SDK | Yes | Yes |
| Microsoft Edge browser | Yes | Yes |
| Power BI Desktop | Yes | No |
| SQL Server 2019 Developer Edition | Yes | Yes |
| SQuirreL SQL | No | Yes |
| SQL Server Management Studio | Yes | No |
| Azure Storage Explorer | Yes | Yes |
| Azure CLI | Yes | Yes |
| AzCopy | Yes | No |
| Azure Cosmos DB Data Migration Tool | Yes | No |
| Unix/Linux command-line tools | No | Yes |
| Apache Spark 3.1 (standalone) | Yes | Yes |
| CUDA, cuDNN, NVIDIA Driver | Yes | Yes |
| Horovod | No | Yes |
| NVIDIA System Management Interface (nvidia-smi) | Yes | Yes |
| PyTorch | Yes | Yes |
| TensorFlow | Yes | Yes |
| Integration with Azure Machine Learning (Python) | Yes | Yes |
| XGBoost | Yes | Yes |
| Vowpal Wabbit | Yes | Yes |
| LightGBM | No | Yes |
| H2O | No | Yes |
| CatBoost | No | Yes |
| Intel MKL | No | Yes |
| OpenCV | No | Yes |
| Dlib | No | Yes |
| Docker | Yes | Yes |
| NCCL | No | Yes |
| ONNX Runtime | No | Yes |

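
If you want to confirm which of the tools listed above are reachable from your current session, a small Python script can check for them. The sketch below is illustrative only: the function name `check_tools` and the two mappings are hypothetical, the executable and module names are the usual ones for each tool but are not taken from this article, and a package installed in a different conda environment from the one running the script will be reported as missing.

```python
import shutil
from importlib.util import find_spec

# Assumed executable names for a few command-line tools from the table.
COMMANDS = {
    "Git": "git",
    "Node.js": "node",
    "Julia": "julia",
    "Docker": "docker",
    "Azure CLI": "az",
    "OpenJDK 11": "java",
    "Jupyter": "jupyter",
}

# Assumed import names for a few Python libraries from the table.
MODULES = {
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "XGBoost": "xgboost",
    "LightGBM": "lightgbm",
}


def check_tools(commands, modules):
    """Map each tool name to True if it is reachable, otherwise False.

    Commands are looked up on PATH; modules are looked up in the Python
    environment running this script, without importing them.
    """
    report = {}
    for tool, executable in commands.items():
        report[tool] = shutil.which(executable) is not None
    for tool, module in modules.items():
        report[tool] = find_spec(module) is not None
    return report


if __name__ == "__main__":
    for tool, found in check_tools(COMMANDS, MODULES).items():
        print(f"{tool}: {'available' if found else 'not found'}")
```

A "not found" result does not necessarily mean the tool is absent from the image: it may simply not be on the PATH, or may live in another Python environment.
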
Updated on August 31, 2023
