Compare revisions

Commits on Source (29)
...@@ -44,7 +44,7 @@ The preparation steps only need to be executed once. You need to carry out those ...@@ -44,7 +44,7 @@ The preparation steps only need to be executed once. You need to carry out those
* Start an interactive job with * Start an interactive job with
``` ```
bsub -Is -W 0:10 -n 1 -R "rusage[mem=2048]" bash srun --ntasks=1 --time=00:10:00 --mem-per-cpu=2048 --pty bash
``` ```
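The change above swaps the LSF command for its Slurm counterpart. As a rough sketch of how the two sets of flags correspond, the mapping could be written as a small function. The name `interactive_job_cmd` is made up for illustration and is not part of the repository.

```shell
#!/bin/bash
# Hypothetical helper: print the interactive-job command for a batch system.
# Arguments: batch system (LSF|SLURM), run time HH:MM, memory per core in MB
interactive_job_cmd() {
    local batch_sys=$1 run_time=$2 mem=$3
    case $batch_sys in
        LSF)
            # LSF takes the run time as [H]H:MM and the memory via rusage
            echo "bsub -Is -W $run_time -n 1 -R \"rusage[mem=$mem]\" bash"
            ;;
        SLURM)
            # Slurm reads a two-field time as minutes:seconds,
            # so ":00" is appended to get HH:MM:SS
            echo "srun --ntasks=1 --time=${run_time}:00 --mem-per-cpu=$mem --pty bash"
            ;;
        *)
            echo "unknown batch system: $batch_sys" >&2
            return 1
            ;;
    esac
}

interactive_job_cmd LSF 0:10 2048
interactive_job_cmd SLURM 00:10 2048
```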
When using Euler, switch to the new software stack (in case you haven't set it as default yet), either using When using Euler, switch to the new software stack (in case you haven't set it as default yet), either using
...@@ -89,13 +89,7 @@ After the server started, terminate it with ctrl+c ...@@ -89,13 +89,7 @@ After the server started, terminate it with ctrl+c
### Install ### Install
Download the repository with the commnad Download the repository with the command
```
git clone https://gitlab.ethz.ch/sfux/VSCode_remote_HPC
```
Mac OS X:
``` ```
git clone https://gitlab.ethz.ch/sfux/VSCode_remote_HPC.git git clone https://gitlab.ethz.ch/sfux/VSCode_remote_HPC.git
...@@ -117,6 +111,7 @@ Options: ...@@ -117,6 +111,7 @@ Options:
-n | --numcores NUM_CPU Number of CPU cores to be used on the cluster -n | --numcores NUM_CPU Number of CPU cores to be used on the cluster
-W | --runtime RUN_TIME Run time limit for the code-server in hours and minutes HH:MM -W | --runtime RUN_TIME Run time limit for the code-server in hours and minutes HH:MM
-m | --memory MEM_PER_CORE Memory limit in MB per core -m | --memory MEM_PER_CORE Memory limit in MB per core
-b | --batchsys BATCH_SYS Batch system to use (LSF or SLURM)
Optional arguments: Optional arguments:
...@@ -126,12 +121,14 @@ Optional arguments: ...@@ -126,12 +121,14 @@ Optional arguments:
-i | --interval INTERVAL Time interval for checking if the job on the cluster already started -i | --interval INTERVAL Time interval for checking if the job on the cluster already started
-k | --key SSH_KEY_PATH Path to SSH key with non-standard name -k | --key SSH_KEY_PATH Path to SSH key with non-standard name
-v | --version Display version of the script and exit -v | --version Display version of the script and exit
-j | --jobargs JOB_ARGS Additional job arguments
Examples: Examples:
./start_vscode.sh -u sfux -n 4 -W 04:00 -m 2048 ./start_vscode.sh -u sfux -b SLURM -n 4 -W 04:00 -m 2048
./start_vscode.sh --username sfux --numcores 2 --runtime 01:30 --memory 2048 ./start_vscode.sh --username sfux --batchsys SLURM --numcores 2 --runtime 01:30 --memory 2048
./start_vscode.sh -c /c/Users/sfux/.vsc_config ./start_vscode.sh -c /c/Users/sfux/.vsc_config
...@@ -144,6 +141,8 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi ...@@ -144,6 +141,8 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi
VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core
VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started
VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name
VSC_BATCH_SYSTEM="SLURM" # Batch system to use (SLURM or LSF)
VSC_JOB_ARGS="" # Additional job arguments
``` ```
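A minimal sketch of how values from such a config file can replace the script defaults. It assumes the file is plain shell read with `source`, which this diff does not show; the temporary file stands in for `$HOME/.vsc_config`.

```shell
#!/bin/bash
# Defaults, as listed in the script
VSC_BATCH_SYSTEM="SLURM"
VSC_JOB_ARGS=""
VSC_MEM_PER_CPU_CORE=1024

# Write an illustrative config file to a temporary location
demo_config=$(mktemp)
cat <<'EOF' > "$demo_config"
VSC_BATCH_SYSTEM="LSF"      # Batch system to use (SLURM or LSF)
VSC_MEM_PER_CPU_CORE=2048   # Memory limit in MB per core
EOF

# Assumption: the config file is read with `source`
source "$demo_config"
rm -f "$demo_config"

# Options set in the file win, everything else keeps its default
echo "batch system : $VSC_BATCH_SYSTEM"
echo "memory       : $VSC_MEM_PER_CPU_CORE"
echo "job args     : '$VSC_JOB_ARGS'"
```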
### Reconnect to a code-server session ### Reconnect to a code-server session
...@@ -165,3 +164,5 @@ This example is from a Linux computer. If you are using git bash on Windows, the ...@@ -165,3 +164,5 @@ This example is from a Linux computer. If you are using git bash on Windows, the
## Contributions ## Contributions
* Andreas Lugmayr * Andreas Lugmayr
* Mike Boss
* Nadia Marounina
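#!/bin/bash
# Stops the local SSH tunnel and kills the batch job listed in reconnect_info
if [[ $# -lt 1 ]]
then
    echo -e "Error: No ETH username is specified, terminating script\n"
    exit 1
fi
VSC_USERNAME=$1
# tunnel specification (port:IP:port) as stored in reconnect_info
VSC_TUNNEL=$(cat reconnect_info | grep -o -E '[0-9]+:([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+):[0-9]+')
# PIDs of the local ssh processes that hold the tunnel
TUNNEL_JOBS=$(ps -u | grep $VSC_TUNNEL | grep ssh | awk '{ print $2 }')
for TUNNEL_JOB in $TUNNEL_JOBS; do echo $TUNNEL_JOB; kill $TUNNEL_JOB; done
# bkill is the LSF command, so this only cancels jobs submitted with LSF
ssh -T $VSC_USERNAME@euler.ethz.ch bkill $(cat reconnect_info | grep BJOB | awk '{print $NF}')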
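The script above always calls `bkill`, which is the LSF command; the Slurm counterpart is `scancel`. A sketch of how the cancel command could be chosen per batch system follows. The function `cancel_cmd` and the job IDs are hypothetical and not part of the repository.

```shell
#!/bin/bash
# Hypothetical helper: print the command that cancels a job on the cluster.
# Arguments: batch system (LSF|SLURM), job ID
cancel_cmd() {
    local batch_sys=$1 job_id=$2
    # refuse anything that is not a plain number
    if ! [[ "$job_id" =~ ^[0-9]+$ ]]; then
        echo "Error: invalid job ID '$job_id'" >&2
        return 1
    fi
    case $batch_sys in
        LSF)   echo "bkill $job_id" ;;
        SLURM) echo "scancel $job_id" ;;
        *)     echo "Error: unknown batch system $batch_sys" >&2; return 1 ;;
    esac
}

cancel_cmd LSF 1234567
cancel_cmd SLURM 7654321
```

The printed command could then be passed to `ssh -T` in place of the hard-coded `bkill` call, provided the batch system and the job ID are available to the script.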
...@@ -2,16 +2,19 @@ ...@@ -2,16 +2,19 @@
############################################################################### ###############################################################################
# # # #
# Script to run on a local computer to start a code-server on Euler and # # Script for local computer to start a code-server on Euler and use a SSH- #
# connect it with a local browser to it # # tunnel to connect it with a local browser #
# # # #
# Main author : Samuel Fux # # Main author : Samuel Fux #
# Contributions : Andreas Lugmayr # # Contributions : Andreas Lugmayr, Mike Boss, Nadia Marounina, #
# Date : October 2021 # # Haoyang Zhou, Loïc Hausammann #
# Date : Oct 2021-2023 #
# Location : ETH Zurich # # Location : ETH Zurich #
# Version : 0.1 #
# Change history : # # Change history : #
# # # #
# 05.05.2023 Added variable for standard port #
# 24.10.2022 Added Slurm support #
# 19.05.2022 JOBID is now saved to reconnect_info file #
# 28.10.2021 Initial version of the script based on Jupyter script # # 28.10.2021 Initial version of the script based on Jupyter script #
# # # #
############################################################################### ###############################################################################
...@@ -21,7 +24,7 @@ ...@@ -21,7 +24,7 @@
############################################################################### ###############################################################################
# Version # Version
VSC_VERSION="0.1" VSC_VERSION="0.5"
# Script directory # Script directory
VSC_SCRIPTDIR=$(pwd) VSC_SCRIPTDIR=$(pwd)
...@@ -34,30 +37,39 @@ VSC_HOSTNAME="euler.ethz.ch" ...@@ -34,30 +37,39 @@ VSC_HOSTNAME="euler.ethz.ch"
# 2. Command line options overwrite defaults # 2. Command line options overwrite defaults
# 3. Config file options overwrite command line options # 3. Config file options overwrite command line options
# Configuration file default : $HOME/.vsc_config # Configuration file default : $HOME/.vsc_config
VSC_CONFIG_FILE="$HOME/.vsc_config" VSC_CONFIG_FILE="$HOME/.vsc_config"
# Username default : no default # Username default : no default
VSC_USERNAME="" VSC_USERNAME=""
# Number of CPU cores default : 1 CPU core # Number of CPU cores default : 1 CPU core
VSC_NUM_CPU=1 VSC_NUM_CPU=1
# Runtime limit default : 1:00 hour # Runtime limit default : 1:00 hour
VSC_RUN_TIME="01:00" VSC_RUN_TIME="01:00"
# Memory default : 1024 MB per core # Memory default : 1024 MB per core
VSC_MEM_PER_CPU_CORE=1024 VSC_MEM_PER_CPU_CORE=1024
# Number of GPUs default : 0 GPUs # Number of GPUs default : 0 GPUs
VSC_NUM_GPU=0 VSC_NUM_GPU=0
# Waiting interval default : 60 seconds # Waiting interval default : 60 seconds
VSC_WAITING_INTERVAL=60 VSC_WAITING_INTERVAL=60
# SSH key location default : no default # SSH key location default : no default
VSC_SSH_KEY_PATH="" VSC_SSH_KEY_PATH=""
# Batch system : Slurm
VSC_BATCH_SYSTEM="SLURM"
# Additional job arguments default : no default
VSC_JOB_ARGS=""
# Standard port for code-server : 8899
VSC_REMOTE_PORT=8899
############################################################################### ###############################################################################
# Usage instructions # # Usage instructions #
############################################################################### ###############################################################################
...@@ -68,12 +80,13 @@ $0: Script to start a VSCode on Euler from a local computer ...@@ -68,12 +80,13 @@ $0: Script to start a VSCode on Euler from a local computer
Usage: start_vscode.sh [options] Usage: start_vscode.sh [options]
Options: Required options:
-u | --username USERNAME ETH username for SSH connection to Euler -b | --batchsys BATCH_SYS Batch system to use (LSF or SLURM)
-m | --memory MEM_PER_CORE Memory limit in MB per core
-n | --numcores NUM_CPU Number of CPU cores to be used on the cluster -n | --numcores NUM_CPU Number of CPU cores to be used on the cluster
-u | --username USERNAME ETH username for SSH connection to Euler
-W | --runtime RUN_TIME Run time limit for the code-server in hours and minutes HH:MM -W | --runtime RUN_TIME Run time limit for the code-server in hours and minutes HH:MM
-m | --memory MEM_PER_CORE Memory limit in MB per core
Optional arguments: Optional arguments:
...@@ -81,14 +94,17 @@ Optional arguments: ...@@ -81,14 +94,17 @@ Optional arguments:
-g | --numgpu NUM_GPU Number of GPUs to be used on the cluster -g | --numgpu NUM_GPU Number of GPUs to be used on the cluster
-h | --help Display help for this script and quit -h | --help Display help for this script and quit
-i | --interval INTERVAL Time interval for checking if the job on the cluster already started -i | --interval INTERVAL Time interval for checking if the job on the cluster already started
-j | --jobargs JOB_ARGS Additional job arguments
-k | --key SSH_KEY_PATH Path to SSH key with non-standard name -k | --key SSH_KEY_PATH Path to SSH key with non-standard name
-p | --port CS_PORT Port number to be used by code-server
-v | --version Display version of the script and exit -v | --version Display version of the script and exit
Examlples:
./start_vscode.sh -u sfux -n 4 -W 04:00 -m 2048 Examples:
./start_vscode.sh --username sfux --numcores 2 --runtime 01:30 --memory 2048 ./start_vscode.sh -u sfux -b SLURM -n 4 -W 04:00 -m 2048
./start_vscode.sh --username sfux --batchsys SLURM --numcores 2 --runtime 01:30 --memory 2048
./start_vscode.sh -c $HOME/.vsc_config ./start_vscode.sh -c $HOME/.vsc_config
...@@ -101,6 +117,9 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi ...@@ -101,6 +117,9 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi
VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core
VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started
VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name
VSC_BATCH_SYSTEM="SLURM" # Batch system to use (SLURM or LSF)
VSC_JOB_ARGS="" # Additional job arguments
VSC_REMOTE_PORT=8899 # Port to be used with the code-server
EOF EOF
exit 1 exit 1
...@@ -160,6 +179,19 @@ do ...@@ -160,6 +179,19 @@ do
shift shift
shift shift
;; ;;
-b|--batchsys)
VSC_BATCH_SYSTEM=$2
shift
shift
;;
-j|--jobargs)
VSC_JOB_ARGS=$2
shift
shift
;;
-p|--port)
VSC_REMOTE_PORT=$2
shift
shift
;;
*) *)
echo -e "Warning: ignoring unknown option $1 \n" echo -e "Warning: ignoring unknown option $1 \n"
shift shift
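Every option that takes a value has to consume two positional parameters, otherwise the `while` loop keeps seeing the same option. The sketch below shows the pattern for three of the options in isolation; it is a reduced illustration wrapped in a made-up function `parse_args`, not the script's full parser.

```shell
#!/bin/bash
# Reduced parser sketch covering only -b, -j and -p
parse_args() {
    # defaults as listed in the script
    VSC_BATCH_SYSTEM="SLURM"
    VSC_JOB_ARGS=""
    VSC_REMOTE_PORT=8899
    while [[ $# -gt 0 ]]; do
        case $1 in
            -b|--batchsys)
                VSC_BATCH_SYSTEM=$2
                shift   # drop the option
                shift   # drop its value
                ;;
            -j|--jobargs)
                VSC_JOB_ARGS=$2
                shift
                shift
                ;;
            -p|--port)
                VSC_REMOTE_PORT=$2
                shift
                shift
                ;;
            *)
                echo -e "Warning: ignoring unknown option $1 \n"
                shift   # drop only the unknown token
                ;;
        esac
    done
}

parse_args -b LSF --port 8900 -j "--nodes=1"
echo "$VSC_BATCH_SYSTEM $VSC_REMOTE_PORT $VSC_JOB_ARGS"
```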
...@@ -267,6 +299,28 @@ else ...@@ -267,6 +299,28 @@ else
echo -e "Using SSH key $VSC_SSH_KEY_PATH" echo -e "Using SSH key $VSC_SSH_KEY_PATH"
fi fi
# check if VSC_BATCH_SYSTEM is set to SLURM or LSF
case $VSC_BATCH_SYSTEM in
LSF)
echo -e "Using LSF batch system"
;;
SLURM)
echo -e "Using Slurm batch system"
;;
*)
echo -e "Error: Unknown batch system $VSC_BATCH_SYSTEM. Please specify either LSF or SLURM as batch system\n"
display_help
;;
esac
# check if VSC_REMOTE_PORT is an integer
if ! [[ "$VSC_REMOTE_PORT" =~ ^[0-9]+$ ]]; then
echo -e "Error: $VSC_REMOTE_PORT -> Incorrect format. Please specify the port number as an integer and try again\n"
display_help
fi
echo -e "Using port number $VSC_REMOTE_PORT for the code-server"
# put together string for SSH options # put together string for SSH options
VSC_SSH_OPT="$VSC_SKPATH $VSC_USERNAME@$VSC_HOSTNAME" VSC_SSH_OPT="$VSC_SKPATH $VSC_USERNAME@$VSC_HOSTNAME"
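The two checks in this hunk can also be expressed as small predicates that return a non-zero status on bad input. This is a sketch only: the function names are hypothetical, and the range check on the port (1 to 65535) goes beyond the integer test the script performs.

```shell
#!/bin/bash
# Hypothetical predicate: only the two spellings the script accepts are valid
valid_batch_system() {
    case $1 in
        LSF|SLURM) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical predicate: one to five digits, then a range check (1-65535)
valid_port() {
    [[ "$1" =~ ^[0-9]{1,5}$ ]] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

for candidate in SLURM LSF slurm; do
    if valid_batch_system "$candidate"; then
        echo "$candidate: accepted"
    else
        echo "$candidate: rejected"
    fi
done

for candidate in 8899 65536 88a9; do
    if valid_port "$candidate"; then
        echo "$candidate: accepted"
    else
        echo "$candidate: rejected"
    fi
done
```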
...@@ -296,15 +350,42 @@ ENDSSH ...@@ -296,15 +350,42 @@ ENDSSH
############################################################################### ###############################################################################
# run the code-server job on Euler and save the ip of the compute node in the file vscip in the home directory of the user on Euler # run the code-server job on Euler and save the ip of the compute node in the file vscip in the home directory of the user on Euler
echo -e "Connecting to $VSC_HOSTNAME to start the code-server in a batch job" echo -e "Connecting to $VSC_HOSTNAME to start the code-server in a $VSC_BATCH_SYSTEM batch job"
# FIXME: save jobid in a variable, that the script can kill the batch job at the end case $VSC_BATCH_SYSTEM in
ssh $VSC_SSH_OPT bsub -n $VSC_NUM_CPU -W $VSC_RUN_TIME -R "rusage[mem=$VSC_MEM_PER_CPU_CORE]" $VSC_SNUM_GPU <<ENDBSUB "LSF")
VSC_BJOB_OUT=$(ssh $VSC_SSH_OPT bsub -n $VSC_NUM_CPU -W $VSC_RUN_TIME -R "rusage[mem=$VSC_MEM_PER_CPU_CORE]" $VSC_SNUM_GPU $VSC_JOB_ARGS<<ENDBSUB
module load $VSC_MODULE_COMMAND
export XDG_RUNTIME_DIR="\$HOME/vsc_runtime"
VSC_IP_REMOTE="\$(hostname -i)"
echo "Remote IP:\$VSC_IP_REMOTE" >> /cluster/home/$VSC_USERNAME/vscip
code-server --bind-addr=\${VSC_IP_REMOTE}:${VSC_REMOTE_PORT}
ENDBSUB
) ;;
"SLURM")
VSC_RUN_TIME="${VSC_RUN_TIME}:00"
if [ "$VSC_NUM_GPU" -gt "0" ]; then
VSC_SNUM_GPU="-G $VSC_NUM_GPU"
fi
VSC_BJOB_OUT=$(ssh $VSC_SSH_OPT sbatch --ntasks=1 --cpus-per-task=$VSC_NUM_CPU "--time=$VSC_RUN_TIME" "--mem-per-cpu=$VSC_MEM_PER_CPU_CORE" -e "error.dat" $VSC_SNUM_GPU $VSC_JOB_ARGS<<ENDBSUB
#!/bin/bash
module load $VSC_MODULE_COMMAND module load $VSC_MODULE_COMMAND
export XDG_RUNTIME_DIR="\$HOME/vsc_runtime" export XDG_RUNTIME_DIR="\$HOME/vsc_runtime"
VSC_IP_REMOTE="\$(hostname -i)" VSC_IP_REMOTE="\$(hostname -i)"
echo "Remote IP:\$VSC_IP_REMOTE" >> /cluster/home/$VSC_USERNAME/vscip echo "Remote IP:\$VSC_IP_REMOTE" >> /cluster/home/$VSC_USERNAME/vscip
code-server --bind-addr=\${VSC_IP_REMOTE}:8899 code-server --bind-addr=\${VSC_IP_REMOTE}:${VSC_REMOTE_PORT}
ENDBSUB ENDBSUB
)
;;
esac
# TODO: get jobid for both cases (LSF/Slurm)
# store jobid in a variable
# VSC_BJOB_ID=$(echo $VSC_BJOB_OUT | awk '/is submitted/{print substr($2, 2, length($2)-2);}')
# wait until batch job has started, poll every $VSC_WAITING_INTERVAL seconds to check if /cluster/home/$VSC_USERNAME/vscip exists # wait until batch job has started, poll every $VSC_WAITING_INTERVAL seconds to check if /cluster/home/$VSC_USERNAME/vscip exists
# once the file exists and is not empty the batch job has started # once the file exists and is not empty the batch job has started
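The TODO about the job ID could be approached as sketched below, assuming the usual submission messages: `Job <ID> is submitted to queue <NAME>.` from `bsub` and `Submitted batch job ID` from `sbatch`. The function name, the job IDs and the queue name are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from the submission output.
# Arguments: batch system (LSF|SLURM), text printed by bsub or sbatch
extract_job_id() {
    local batch_sys=$1 submit_out=$2
    case $batch_sys in
        LSF)
            # "Job <123> is submitted to queue <q>." -> strip < and > from field 2
            echo "$submit_out" | awk '/is submitted/{print substr($2, 2, length($2)-2)}'
            ;;
        SLURM)
            # "Submitted batch job 123" -> last field
            echo "$submit_out" | awk '/Submitted batch job/{print $NF}'
            ;;
    esac
}

extract_job_id LSF "Job <215698> is submitted to queue <normal.4h>."
extract_job_id SLURM "Submitted batch job 4711"
```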
...@@ -318,10 +399,9 @@ ENDSSH ...@@ -318,10 +399,9 @@ ENDSSH
# give the code-server a few seconds to start # give the code-server a few seconds to start
sleep 7 sleep 7
# get remote ip, port and token from files stored on Euler # get remote ip and token from files stored on Euler
echo -e "Receiving ip, port and token from the code-server" echo -e "Receiving ip, port and token from the code-server"
VSC_REMOTE_IP=$(ssh $VSC_SSH_OPT "cat /cluster/home/$VSC_USERNAME/vscip | grep -m1 'Remote IP' | cut -d ':' -f 2") VSC_REMOTE_IP=$(ssh $VSC_SSH_OPT "cat /cluster/home/$VSC_USERNAME/vscip | grep -m1 'Remote IP' | cut -d ':' -f 2")
VSC_REMOTE_PORT=8899
# check if the IP, the port and the token are defined # check if the IP, the port and the token are defined
if [[ "$VSC_REMOTE_IP" == "" ]]; then if [[ "$VSC_REMOTE_IP" == "" ]]; then
...@@ -347,6 +427,10 @@ VSC_LOCAL_PORT=$((3 * 2**14 + RANDOM % 2**14)) ...@@ -347,6 +427,10 @@ VSC_LOCAL_PORT=$((3 * 2**14 + RANDOM % 2**14))
echo -e "Using local port: $VSC_LOCAL_PORT" echo -e "Using local port: $VSC_LOCAL_PORT"
# write reconnect_info file # write reconnect_info file
#
# FIXME: add jobid
# BJOB ID : $VSC_BJOB_ID
cat <<EOF > $VSC_SCRIPTDIR/reconnect_info cat <<EOF > $VSC_SCRIPTDIR/reconnect_info
Restart file Restart file
Remote IP address : $VSC_REMOTE_IP Remote IP address : $VSC_REMOTE_IP
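For reference, the expression used for `VSC_LOCAL_PORT` draws from the range 49152 to 65535: bash's `RANDOM` yields 0 to 32767, the modulo reduces that to 0 to 16383, and the offset `3 * 2**14` is 49152. The sketch below only makes these bounds explicit; the function name is hypothetical.

```shell
#!/bin/bash
# Same expression as in the script, wrapped in a hypothetical function
pick_local_port() {
    echo $((3 * 2**14 + RANDOM % 2**14))
}

# Smallest and largest value the expression can produce
lower=$((3 * 2**14))
upper=$((3 * 2**14 + 2**14 - 1))

echo "Possible range: $lower-$upper"
echo "Example port  : $(pick_local_port)"
```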
...@@ -368,7 +452,7 @@ sleep 5 ...@@ -368,7 +452,7 @@ sleep 5
# save url in variable # save url in variable
VSC_URL=http://localhost:$VSC_LOCAL_PORT VSC_URL=http://localhost:$VSC_LOCAL_PORT
echo -e "Starting browser and connecting it to the code-server" echo -e "Starting browser and connecting it to the code-server"
echo -e "Connecting to url $VSc_URL" echo -e "Connecting to url $VSC_URL"
# start local browser if possible # start local browser if possible
if [[ "$OSTYPE" == "linux-gnu" ]]; then if [[ "$OSTYPE" == "linux-gnu" ]]; then
...@@ -380,4 +464,4 @@ elif [[ "$OSTYPE" == "msys" ]]; then # Git Bash on Windows 10 ...@@ -380,4 +464,4 @@ elif [[ "$OSTYPE" == "msys" ]]; then # Git Bash on Windows 10
else else
echo -e "Your operating system does not allow starting the browser automatically." echo -e "Your operating system does not allow starting the browser automatically."
echo -e "Please open $VSC_URL in your browser." echo -e "Please open $VSC_URL in your browser."
fi fi
\ No newline at end of file
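The `OSTYPE` dispatch at the end of the script can be illustrated in isolation. The commands behind each branch are not visible in this diff, so `xdg-open`, `open` and `start` are assumptions based on what is commonly used on Linux, macOS and Git Bash; the function name is hypothetical as well.

```shell
#!/bin/bash
# Hypothetical helper: print the command that opens a URL for a given OSTYPE.
# Returns non-zero if no opener is known for the platform.
browser_cmd() {
    case $1 in
        linux-gnu*) echo "xdg-open" ;;   # assumption: freedesktop opener
        darwin*)    echo "open" ;;       # assumption: macOS opener
        msys*)      echo "start" ;;      # assumption: Git Bash on Windows
        *)          return 1 ;;
    esac
}

VSC_URL="http://localhost:49152"   # illustrative URL
if opener=$(browser_cmd "$OSTYPE"); then
    echo "Would run: $opener $VSC_URL"
else
    echo "Please open $VSC_URL in your browser."
fi
```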
...@@ -5,3 +5,4 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi ...@@ -5,3 +5,4 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi
VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core
VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started
VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name
VSC_JOB_ARGS="" # Additional arguments when submitting the job