Marconi100: job-level GPU usage and accounting now available

07/07/2021

Dear Users,

During the last M100 maintenance, we configured the SLURM resource manager to collect GPU usage and accounting statistics for each job. The service is based on the NVIDIA Data Center GPU Manager (DCGM) and, at the end of each job, produces one report per node covering all the requested GPUs. The reports are saved in the job submit directory, in files named "dcgmi_stats_<nodename>_<jobid>.out".

The report contains statistics on GPU usage for your run (power and memory usage, etc.) and an assessment of the overall health state of the GPUs.
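As an illustration only, the per-node reports of a given job could be collected with a short Python helper like the one below. It relies solely on the file naming scheme described above. The function name `find_gpu_reports` is hypothetical, and the sketch assumes a purely numeric job ID, so it may need adapting for other ID formats. It does not parse the report contents, whose layout is not described here.

```python
import re
from pathlib import Path

# Naming scheme from the announcement: dcgmi_stats_<nodename>_<jobid>.out
# Assumption: <jobid> is purely numeric.
REPORT_RE = re.compile(r"^dcgmi_stats_(?P<node>.+)_(?P<jobid>\d+)\.out$")


def find_gpu_reports(submit_dir, job_id):
    """Return a {nodename: Path} mapping of the DCGM reports for one job.

    Hypothetical helper: it only looks at file names in submit_dir and
    does not read or interpret the reports themselves.
    """
    reports = {}
    for path in Path(submit_dir).glob(f"dcgmi_stats_*_{job_id}.out"):
        match = REPORT_RE.match(path.name)
        # Double-check the job ID extracted from the file name.
        if match and match.group("jobid") == str(job_id):
            reports[match.group("node")] = path
    return reports
```

For example, `find_gpu_reports(".", 1234567)`, run from the submit directory, would return one entry per node that produced a report for job 1234567 (an invented job ID), or an empty dictionary if none is found.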

Best regards,

HPC User Support @ CINECA
