Commits on Source (29)
......@@ -44,7 +44,7 @@ The preparation steps only need to be executed once. You need to carry out those
* Start an interactive job with
```
bsub -Is -W 0:10 -n 1 -R "rusage[mem=2048]" bash
srun --ntasks=1 --time=00:10:00 --mem-per-cpu=2048 --pty bash
```
When using Euler, switch to the new software stack (in case you haven't set it as default yet), either using
......@@ -89,13 +89,7 @@ After the server started, terminate it with ctrl+c
### Install
Download the repository with the commnad
```
git clone https://gitlab.ethz.ch/sfux/VSCode_remote_HPC
```
Mac OS X:
Download the repository with the command
```
git clone https://gitlab.ethz.ch/sfux/VSCode_remote_HPC.git
......@@ -117,6 +111,7 @@ Options:
-n | --numcores NUM_CPU Number of CPU cores to be used on the cluster
-W | --runtime RUN_TIME Run time limit for the code-server in hours and minutes HH:MM
-m | --memory MEM_PER_CORE Memory limit in MB per core
-b | --batchsys BATCH_SYS Batch system to use (LSF or SLURM)
Optional arguments:
......@@ -126,12 +121,14 @@ Optional arguments:
-i | --interval INTERVAL Time interval for checking if the job on the cluster already started
-k | --key SSH_KEY_PATH Path to SSH key with non-standard name
-v | --version Display version of the script and exit
-j | --jobargs JOB_ARGS Additional job arguments
Examples:
./start_vscode.sh -u sfux -n 4 -W 04:00 -m 2048
./start_vscode.sh -u sfux -b SLURM -n 4 -W 04:00 -m 2048
./start_vscode.sh --username sfux --numcores 2 --runtime 01:30 --memory 2048
./start_vscode.sh --username sfux --batchsys SLURM --numcores 2 --runtime 01:30 --memory 2048
./start_vscode.sh -c /c/Users/sfux/.vsc_config
......@@ -144,6 +141,8 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi
VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core
VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started
VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name
VSC_BATCH_SYSTEM="SLURM" # Batch system to use (SLURM or LSF)
VSC_JOB_ARGS="" # Additional job arguments
```
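The script states that config file options overwrite command line options. Because the config file is plain `VAR=value` shell syntax, one way to get that behaviour is to source the file after the command line has been parsed. The following is only a sketch under that assumption; the helper name `load_vsc_config` is hypothetical and the actual script may read the file differently.

```bash
# Sketch only: load_vsc_config is a hypothetical helper, not part of start_vscode.sh.

# Defaults as listed in the config example above
VSC_BATCH_SYSTEM="SLURM"
VSC_JOB_ARGS=""

load_vsc_config() {
    local config_file="$1"
    if [[ -f "$config_file" ]]; then
        # The file contains plain shell assignments, so sourcing it
        # overwrites any value that was set before this call.
        source "$config_file"
        return 0
    fi
    # No config file found: keep the current values
    return 1
}
```

Calling `load_vsc_config "$HOME/.vsc_config"` after option parsing would then let the file take precedence over `-b` or `-j` given on the command line.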
### Reconnect to a code-server session
......@@ -165,3 +164,5 @@ This example is from a Linux computer. If you are using git bash on Windows, the
## Contributions
* Andreas Lugmayr
* Mike Boss
* Nadia Marounina
#!/bin/bash
if [[ $# -lt 1 ]]
then
echo -e "Error: No ETH username is specified, terminating script\n"
exit 1
fi
VSC_USERNAME=$1
VSC_TUNNEL=$(cat reconnect_info | grep -o -E '[0-9]+:([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+):[0-9]+')
TUNNEL_JOBS=$(ps -u | grep $VSC_TUNNEL | grep ssh | awk '{ print $2 }')
for TUNNEL_JOB in $TUNNEL_JOBS; do echo $TUNNEL_JOB; kill $TUNNEL_JOB; done
ssh -T $VSC_USERNAME@euler.ethz.ch bkill $(cat reconnect_info | grep BJOB | awk '{print $NF}')
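The cancel script above finds the SSH tunnel by extracting a `LOCALPORT:IP:REMOTEPORT` string from `reconnect_info`. A slightly more defensive variant of that extraction could look like the sketch below. The function name `tunnel_spec_from_file` is hypothetical, and the exact layout of `reconnect_info` is assumed rather than taken from the script; only the regular expression is the same as in the script.

```bash
# Sketch only: tunnel_spec_from_file is a hypothetical helper.
# It prints the first LOCALPORT:IP:REMOTEPORT match found in the given file.
tunnel_spec_from_file() {
    local info_file="$1"
    # grep reads the file directly (no cat needed); head keeps only the
    # first match in case the pattern appears more than once.
    grep -o -E '[0-9]+:[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:[0-9]+' "$info_file" | head -n 1
}
```

Quoting the result when it is passed on (for example `grep -- "$VSC_TUNNEL"`) avoids word splitting if the file is empty or malformed.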
......@@ -2,16 +2,19 @@
###############################################################################
# #
# Script to run on a local computer to start a code-server on Euler and #
# connect it with a local browser to it #
# Script for local computer to start a code-server on Euler and use an SSH #
# tunnel to connect it with a local browser #
# #
# Main author : Samuel Fux #
# Contributions : Andreas Lugmayr #
# Date : October 2021 #
# Contributions : Andreas Lugmayr, Mike Boss, Nadia Marounina, #
# Haoyang Zhou, Loïc Hausammann #
# Date : Oct 2021-2023 #
# Location : ETH Zurich #
# Version : 0.1 #
# Change history : #
# #
# 05.05.2023 Added variable for standard port #
# 24.10.2022 Added Slurm support #
# 19.05.2022 JOBID is now saved to reconnect_info file #
# 28.10.2021 Initial version of the script based on Jupyter script #
# #
###############################################################################
......@@ -21,7 +24,7 @@
###############################################################################
# Version
VSC_VERSION="0.1"
VSC_VERSION="0.5"
# Script directory
VSC_SCRIPTDIR=$(pwd)
......@@ -34,30 +37,39 @@ VSC_HOSTNAME="euler.ethz.ch"
# 2. Command line options overwrite defaults
# 3. Config file options overwrite command line options
# Configuration file default : $HOME/.vsc_config
# Configuration file default : $HOME/.vsc_config
VSC_CONFIG_FILE="$HOME/.vsc_config"
# Username default : no default
# Username default : no default
VSC_USERNAME=""
# Number of CPU cores default : 1 CPU core
# Number of CPU cores default : 1 CPU core
VSC_NUM_CPU=1
# Runtime limit default : 1:00 hour
# Runtime limit default : 1:00 hour
VSC_RUN_TIME="01:00"
# Memory default : 1024 MB per core
# Memory default : 1024 MB per core
VSC_MEM_PER_CPU_CORE=1024
# Number of GPUs default : 0 GPUs
# Number of GPUs default : 0 GPUs
VSC_NUM_GPU=0
# Waiting interval default : 60 seconds
# Waiting interval default : 60 seconds
VSC_WAITING_INTERVAL=60
# SSH key location default : no default
# SSH key location default : no default
VSC_SSH_KEY_PATH=""
# Batch system : Slurm
VSC_BATCH_SYSTEM="SLURM"
# Additional job arguments default : no default
VSC_JOB_ARGS=""
# Standard port for code-server : 8899
VSC_REMOTE_PORT=8899
###############################################################################
# Usage instructions #
###############################################################################
......@@ -68,12 +80,13 @@ $0: Script to start a VSCode on Euler from a local computer
Usage: start_vscode.sh [options]
Options:
Required options:
-u | --username USERNAME ETH username for SSH connection to Euler
-b | --batchsys BATCH_SYS Batch system to use (LSF or SLURM)
-m | --memory MEM_PER_CORE Memory limit in MB per core
-n | --numcores NUM_CPU Number of CPU cores to be used on the cluster
-u | --username USERNAME ETH username for SSH connection to Euler
-W | --runtime RUN_TIME Run time limit for the code-server in hours and minutes HH:MM
-m | --memory MEM_PER_CORE Memory limit in MB per core
Optional arguments:
......@@ -81,14 +94,17 @@ Optional arguments:
-g | --numgpu NUM_GPU Number of GPUs to be used on the cluster
-h | --help Display help for this script and quit
-i | --interval INTERVAL Time interval for checking if the job on the cluster already started
-j | --jobargs JOB_ARGS Additional job arguments
-k | --key SSH_KEY_PATH Path to SSH key with non-standard name
-p | --port CS_PORT Port number to be used by code-server
-v | --version Display version of the script and exit
Examlples:
./start_vscode.sh -u sfux -n 4 -W 04:00 -m 2048
Examples:
./start_vscode.sh --username sfux --numcores 2 --runtime 01:30 --memory 2048
./start_vscode.sh -u sfux -b SLURM -n 4 -W 04:00 -m 2048
./start_vscode.sh --username sfux --batchsys SLURM --numcores 2 --runtime 01:30 --memory 2048
./start_vscode.sh -c $HOME/.vsc_config
......@@ -101,6 +117,9 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi
VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core
VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started
VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name
VSC_BATCH_SYSTEM="SLURM" # Batch system to use (SLURM or LSF)
VSC_JOB_ARGS="" # Additional job arguments
VSC_REMOTE_PORT=8899 # Port to be used with the code-server
EOF
exit 1
......@@ -160,6 +179,19 @@ do
shift
shift
;;
-b|--batchsys)
VSC_BATCH_SYSTEM=$2
shift
shift
;;
-j|--jobargs)
VSC_JOB_ARGS=$2
shift
shift
;;
-p|--port)
VSC_REMOTE_PORT=$2
shift
shift
;;
*)
echo -e "Warning: ignoring unknown option $1 \n"
shift
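Every option in this loop that takes a value has to consume two positional parameters (the flag and its value), otherwise the loop never advances. A compact sketch of that pattern for the three options added in this hunk is shown below. The function name `parse_new_options` and the missing-value check are illustrative assumptions, not part of the script.

```bash
# Sketch only: parse_new_options is a hypothetical stand-alone version of
# the option loop, limited to the options added in this hunk.
parse_new_options() {
    while [[ $# -gt 0 ]]; do
        case $1 in
            -b|--batchsys|-j|--jobargs|-p|--port)
                # Guard against a flag given without a value; "shift 2"
                # would otherwise fail and leave the loop stuck.
                if [[ $# -lt 2 ]]; then
                    echo "Error: option $1 requires a value" >&2
                    return 1
                fi
                case $1 in
                    -b|--batchsys) VSC_BATCH_SYSTEM=$2 ;;
                    -j|--jobargs)  VSC_JOB_ARGS=$2 ;;
                    -p|--port)     VSC_REMOTE_PORT=$2 ;;
                esac
                # Consume both the flag and its value
                shift 2
                ;;
            *)
                echo "Warning: ignoring unknown option $1" >&2
                shift
                ;;
        esac
    done
}
```

For example, `parse_new_options -b LSF -p 9000` sets `VSC_BATCH_SYSTEM` to `LSF` and `VSC_REMOTE_PORT` to `9000`.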
......@@ -267,6 +299,28 @@ else
echo -e "Using SSH key $VSC_SSH_KEY_PATH"
fi
# check if VSC_BATCH_SYSTEM is set to SLURM or LSF
case $VSC_BATCH_SYSTEM in
LSF)
echo -e "Using LSF batch system"
;;
SLURM)
echo -e "Using Slurm batch system"
;;
*)
echo -e "Error: Unknown batch system $VSC_BATCH_SYSTEM. Please specify either LSF or SLURM as batch system\n"
display_help
;;
esac
# check if VSC_REMOTE_PORT is an integer
if ! [[ "$VSC_REMOTE_PORT" =~ ^[0-9]+$ ]]; then
echo -e "Error: $VSC_REMOTE_PORT -> Incorrect format. Please specify the port number as an integer and try again\n"
display_help
fi
echo -e "Using port number $VSC_REMOTE_PORT for the code-server"
# put together string for SSH options
VSC_SSH_OPT="$VSC_SKPATH $VSC_USERNAME@$VSC_HOSTNAME"
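The port check above only verifies that the value consists of digits. Since TCP port numbers are limited to 1 through 65535, a stricter check could also test the range. The sketch below shows one way to do that; the function name `is_valid_port` is hypothetical and the range check is an addition that is not in the script.

```bash
# Sketch only: is_valid_port is a hypothetical helper.
# Returns 0 if the argument is an integer between 1 and 65535.
is_valid_port() {
    local port="$1"
    # Same digit-only test as in the script
    [[ "$port" =~ ^[0-9]+$ ]] || return 1
    # Reject very long digit strings before doing arithmetic on them
    (( ${#port} <= 5 )) || return 1
    # 10# forces base 10 so that leading zeros are not read as octal
    (( 10#$port >= 1 && 10#$port <= 65535 ))
}
```

It could be used as `is_valid_port "$VSC_REMOTE_PORT" || display_help`.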
......@@ -296,15 +350,42 @@ ENDSSH
###############################################################################
# run the code-server job on Euler and save the ip of the compute node in the file vscip in the home directory of the user on Euler
echo -e "Connecting to $VSC_HOSTNAME to start the code-server in a batch job"
# FIXME: save jobid in a variable, that the script can kill the batch job at the end
ssh $VSC_SSH_OPT bsub -n $VSC_NUM_CPU -W $VSC_RUN_TIME -R "rusage[mem=$VSC_MEM_PER_CPU_CORE]" $VSC_SNUM_GPU <<ENDBSUB
echo -e "Connecting to $VSC_HOSTNAME to start the code-server in a $VSC_BATCH_SYSTEM batch job"
case $VSC_BATCH_SYSTEM in
"LSF")
VSC_BJOB_OUT=$(ssh $VSC_SSH_OPT bsub -n $VSC_NUM_CPU -W $VSC_RUN_TIME -R "rusage[mem=$VSC_MEM_PER_CPU_CORE]" $VSC_SNUM_GPU $VSC_JOB_ARGS<<ENDBSUB
module load $VSC_MODULE_COMMAND
export XDG_RUNTIME_DIR="\$HOME/vsc_runtime"
VSC_IP_REMOTE="\$(hostname -i)"
echo "Remote IP:\$VSC_IP_REMOTE" >> /cluster/home/$VSC_USERNAME/vscip
code-server --bind-addr=\${VSC_IP_REMOTE}:${VSC_REMOTE_PORT}
ENDBSUB
) ;;
"SLURM")
# Slurm expects HH:MM:SS, so append the seconds to the HH:MM value
VSC_RUN_TIME="${VSC_RUN_TIME}:00"
if [ "$VSC_NUM_GPU" -gt "0" ]; then
VSC_SNUM_GPU="-G $VSC_NUM_GPU"
fi
VSC_BJOB_OUT=$(ssh $VSC_SSH_OPT sbatch --ntasks=1 --cpus-per-task=$VSC_NUM_CPU "--time=$VSC_RUN_TIME" "--mem-per-cpu=$VSC_MEM_PER_CPU_CORE" -e "error.dat" $VSC_SNUM_GPU $VSC_JOB_ARGS<<ENDBSUB
#!/bin/bash
module load $VSC_MODULE_COMMAND
export XDG_RUNTIME_DIR="\$HOME/vsc_runtime"
VSC_IP_REMOTE="\$(hostname -i)"
echo "Remote IP:\$VSC_IP_REMOTE" >> /cluster/home/$VSC_USERNAME/vscip
code-server --bind-addr=\${VSC_IP_REMOTE}:8899
code-server --bind-addr=\${VSC_IP_REMOTE}:${VSC_REMOTE_PORT}
ENDBSUB
)
;;
esac
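The script takes the run time as HH:MM, which matches what `bsub -W` accepts. Slurm's `--time` would read a two-field value as minutes:seconds, so the Slurm branch appends `:00` to obtain hours:minutes:seconds. That conversion can be sketched as a small function. The name `slurm_time_from_hhmm` and the format check are illustrative assumptions.

```bash
# Sketch only: slurm_time_from_hhmm is a hypothetical helper.
# Converts an HH:MM run time into the HH:MM:SS form used with sbatch --time.
slurm_time_from_hhmm() {
    local run_time="$1"
    # Accept one or more hour digits and exactly two minute digits (00-59)
    if [[ "$run_time" =~ ^[0-9]+:[0-5][0-9]$ ]]; then
        echo "${run_time}:00"
    else
        echo "Error: run time '$run_time' is not in HH:MM format" >&2
        return 1
    fi
}
```

With this, `slurm_time_from_hhmm "01:30"` prints `01:30:00`, while a value such as `90` is rejected instead of being passed on to `sbatch`.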
# TODO: get jobid for both cases (LSF/Slurm)
# store jobid in a variable
# VSC_BJOB_ID=$(echo $VSC_BJOB_OUT | awk '/is submitted/{print substr($2, 2, length($2)-2);}')
# wait until batch job has started, poll every $VSC_WAITING_INTERVAL seconds to check if /cluster/home/$VSC_USERNAME/vscip exists
# once the file exists and is not empty the batch job has started
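The TODO above asks for the job ID to be extracted for both batch systems. `bsub` normally reports `Job <ID> is submitted to queue <NAME>.` and `sbatch` reports `Submitted batch job ID`, so one possible approach is a small function that handles both formats. This is only a sketch: the name `extract_job_id` is hypothetical, and the output formats are assumed to be the default ones.

```bash
# Sketch only: extract_job_id is a hypothetical helper.
# Usage: extract_job_id LSF|SLURM "<output of bsub or sbatch>"
extract_job_id() {
    local batch_system="$1"
    local submit_output="$2"
    case $batch_system in
        LSF)
            # Default bsub output: "Job <12345> is submitted to queue <name>."
            echo "$submit_output" | sed -n -E 's/^Job <([0-9]+)> is submitted.*/\1/p'
            ;;
        SLURM)
            # Default sbatch output: "Submitted batch job 12345"
            echo "$submit_output" | sed -n -E 's/^Submitted batch job ([0-9]+).*/\1/p'
            ;;
        *)
            return 1
            ;;
    esac
}
```

The result could then be stored with `VSC_BJOB_ID=$(extract_job_id "$VSC_BATCH_SYSTEM" "$VSC_BJOB_OUT")` and written to `reconnect_info`.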
......@@ -318,10 +399,9 @@ ENDSSH
# give the code-server a few seconds to start
sleep 7
# get remote ip, port and token from files stored on Euler
# get remote ip and token from files stored on Euler
echo -e "Receiving ip, port and token from the code-server"
VSC_REMOTE_IP=$(ssh $VSC_SSH_OPT "cat /cluster/home/$VSC_USERNAME/vscip | grep -m1 'Remote IP' | cut -d ':' -f 2")
VSC_REMOTE_PORT=8899
# check if the IP, the port and the token are defined
if [[ "$VSC_REMOTE_IP" == "" ]]; then
......@@ -347,6 +427,10 @@ VSC_LOCAL_PORT=$((3 * 2**14 + RANDOM % 2**14))
echo -e "Using local port: $VSC_LOCAL_PORT"
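The expression `3 * 2**14 + RANDOM % 2**14` picks a port between 49152 and 65535, which is the dynamic/private port range: `RANDOM` yields 0 to 32767, the modulo reduces it to 0 to 16383, and 3 * 16384 = 49152 is added as offset. The sketch below wraps the same arithmetic in a function for illustration; the name `random_dynamic_port` is hypothetical.

```bash
# Sketch only: random_dynamic_port is a hypothetical helper using the same
# arithmetic as the script.
random_dynamic_port() {
    # 3 * 2**14 = 49152 is the lower bound of the dynamic port range;
    # RANDOM % 2**14 adds an offset between 0 and 16383.
    echo $(( 3 * 2**14 + RANDOM % 2**14 ))
}
```

Note that this does not check whether the chosen port is already in use on the local machine.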
# write reconnect_info file
#
# FIXME: add jobid
# BJOB ID : $VSC_BJOB_ID
cat <<EOF > $VSC_SCRIPTDIR/reconnect_info
Restart file
Remote IP address : $VSC_REMOTE_IP
......@@ -368,7 +452,7 @@ sleep 5
# save url in variable
VSC_URL=http://localhost:$VSC_LOCAL_PORT
echo -e "Starting browser and connecting it to the code-server"
echo -e "Connecting to url $VSc_URL"
echo -e "Connecting to url $VSC_URL"
# start local browser if possible
if [[ "$OSTYPE" == "linux-gnu" ]]; then
......@@ -380,4 +464,4 @@ elif [[ "$OSTYPE" == "msys" ]]; then # Git Bash on Windows 10
else
echo -e "Your operating system does not allow to start the browser automatically."
echo -e "Please open $VSC_URL in your browser."
fi
fi
\ No newline at end of file
......@@ -5,3 +5,4 @@ VSC_RUN_TIME="01:00" # Run time limit for the code-server in hours and mi
VSC_MEM_PER_CPU_CORE=1024 # Memory limit in MB per core
VSC_WAITING_INTERVAL=60 # Time interval to check if the job on the cluster already started
VSC_SSH_KEY_PATH="" # Path to SSH key with non-standard name
VSC_JOB_ARGS="" # Additional arguments when submitting the job