snakemake how to encode pair analysis - bioinformatics

I want to run GATK recalibration using paired samples (tumor and normal). I need to parse the data using pandas. This is what I wrote:
expand("mapped_reads/merged_samples/{sample[1][tumor]}/{sample[1][tumor]}_{sample[1][normal]}.bam", sample=read_table(config["conditions"], ",").iterrows())
This is the conditions file:
432,433
434,435
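For reference, the same pairing can be built with plain pandas before handing the paths to Snakemake (a sketch; it assumes the conditions file carries a header line tumor,normal, which the sample[1][tumor] lookup above requires):

import pandas as pd

# config["conditions"] points at the conditions file shown above;
# a header line "tumor,normal" is assumed
conditions = pd.read_table(config["conditions"], sep=",")

targets = [
    "mapped_reads/merged_samples/{t}/{t}_{n}.bam".format(t=row["tumor"], n=row["normal"])
    for _, row in conditions.iterrows()
]
# -> ['mapped_reads/merged_samples/432/432_433.bam',
#     'mapped_reads/merged_samples/434/434_435.bam']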
I wrote this rule:
rule gatk_RealignerTargetCreator:
    input:
        "mapped_reads/merged_samples/{tumor}.sorted.dup.reca.bam",
        "mapped_reads/merged_samples/{normal}.sorted.dup.reca.bam",
    output:
        "mapped_reads/merged_samples/{tumor}/{tumor}_{normal}.realign.intervals"
    params:
        genome=config['reference']['genome_fasta'],
        mills=config['mills'],
        ph1_indels=config['know_phy'],
    log:
        "mapped_reads/merged_samples/logs/{tumor}_{normal}.realign_info.log"
    threads: 8
    shell:
        "gatk -T RealignerTargetCreator -R {params.genome} {params.custom} "
        "-nt {threads} "
        "-I {wildcard.tumor} -I {wildcard.normal} -known {params.ph1_indels} "
        "-o {output} >& {log}"
I have this error:
InputFunctionException in line 17 of /home/maurizio/Desktop/TEST_exome/rules/samfiles.rules:
KeyError: '432/432_433'
Wildcards:
sample=432/432_433
This is samfiles.rules:
rule samtools_merge_bam:
    """
    Merge bam files for multiple units into one for the given sample.
    If the sample has only one unit, files will be copied.
    """
    input:
        lambda wildcards: expand("mapped_reads/bam/{unit}_sorted.bam", unit=config["samples"][wildcards.sample])
    output:
        "mapped_reads/merged_samples/{sample}.bam"
    benchmark:
        "benchmarks/samtools/merge/{sample}.txt"
    run:
        if len(input) > 1:
            shell("/illumina/software/PROG2/samtools-1.3.1/samtools merge {output} {input}")
        else:
            shell("cp {input} {output} && touch -h {output}")

I can only guess because you don't show all the relevant rules, but I would say the error occurs because the rule samtools_merge_bam also applies to some later bam file whose path matches the pattern {tumor}/{tumor}_{normal}...
As a solution, you have to resolve this ambiguity (see the Snakemake tutorial). For example, you can constrain the wildcard of samtools_merge_bam so that it cannot contain any slashes:
wildcard_constraints:
    sample="[^/]+"
You can put the constraint either globally or inside your samtools_merge_bam rule.
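Applied inside the rule, the constraint would look like this (a minimal sketch reusing the paths from the question):

rule samtools_merge_bam:
    input:
        lambda wildcards: expand("mapped_reads/bam/{unit}_sorted.bam",
                                 unit=config["samples"][wildcards.sample])
    output:
        "mapped_reads/merged_samples/{sample}.bam"
    wildcard_constraints:
        # {sample} may not contain a slash, so this rule no longer
        # matches the {tumor}/{tumor}_{normal} paths of the pair rules
        sample="[^/]+"
    run:
        if len(input) > 1:
            shell("samtools merge {output} {input}")
        else:
            shell("cp {input} {output} && touch -h {output}")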

Related

baseDir issue with Nextflow

This might be a very basic question, but I have just started with Nextflow and I am struggling with the simplest example. First I will explain what I have done, then the problem.
Aim: I want to build a workflow for my bioinformatics analyses like the one here: https://www.nextflow.io/example4.html
Background: I have installed all the required packages and they all work from the console without any error.
My run: I have used the same script as in the example, only replacing the directory names. Here is how I have arranged the directories:
Location of the script:
~/raman/nflow/script.nf
Location of the FASTQ files:
~/raman/nflow/Data/T4_1.fq.gz
~/raman/nflow/Data/T4_2.fq.gz
Location of the transcriptome file:
~/raman/nflow/Genome/trans.fa
The script:
#!/usr/bin/env nextflow

/*
 * The following pipeline parameters specify the reference genomes
 * and read pairs and can be provided as command line options
 */
params.reads = "$baseDir/Data/T4_{1,2}.fq.gz"
params.transcriptome = "$baseDir/HumanGenome/SalmonIndex/gencode.v42.transcripts.fa"
params.outdir = "results"

workflow {
    read_pairs_ch = channel.fromFilePairs( params.reads, checkIfExists: true )
    INDEX(params.transcriptome)
    FASTQC(read_pairs_ch)
    QUANT(INDEX.out, read_pairs_ch)
}

process INDEX {
    tag "$transcriptome.simpleName"

    input:
    path transcriptome

    output:
    path 'index'

    script:
    """
    salmon index --threads $task.cpus -t $transcriptome -i index
    """
}

process FASTQC {
    tag "FASTQC on $sample_id"
    publishDir params.outdir

    input:
    tuple val(sample_id), path(reads)

    output:
    path "fastqc_${sample_id}_logs"

    script:
    """
    fastqc "$sample_id" "$reads"
    """
}

process QUANT {
    tag "$pair_id"
    publishDir params.outdir

    input:
    path index
    tuple val(pair_id), path(reads)

    output:
    path pair_id

    script:
    """
    salmon quant --threads $task.cpus --libType=U -i $index -1 ${reads[0]} -2 ${reads[1]} -o $pair_id
    """
}
Output:
(base) ntr@ser:~/raman/nflow$ nextflow script.nf
N E X T F L O W ~ version 22.10.1
Launching `script.nf` [modest_meninsky] DSL2 - revision: 032a643b56
executor > local (2)
[- ] process > INDEX (gencode) -
[28/02cde5] process > FASTQC (FASTQC on T4) [100%] 1 of 1, failed: 1 ✘
[- ] process > QUANT -
Error executing process > 'FASTQC (FASTQC on T4)'
Caused by:
Missing output file(s) `fastqc_T4_logs` expected by process `FASTQC (FASTQC on T4)`
Command executed:
fastqc "T4" "T4_1.fq.gz T4_2.fq.gz"
Command exit status:
0
Command output:
(empty)
Command error:
Skipping 'T4' which didn't exist, or couldn't be read
Skipping 'T4_1.fq.gz T4_2.fq.gz' which didn't exist, or couldn't be read
Work dir:
/home/ruby/raman/nflow/work/28/02cde5184f4accf9a05bc2ded29c50
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
I believe I have an issue with my understanding of baseDir. I am assuming that baseDir is the directory that contains my script.nf file. I am not sure what is going wrong or how I can fix it. Could anyone please help or guide me? Thank you.
Caused by:
Missing output file(s) `fastqc_T4_logs` expected by process `FASTQC (FASTQC on T4)`
Nextflow complains when it can't find the declared output files. This can occur even if the command completes successfully, i.e. with exit status 0. The problem here is that fastqc simply skips files that don't exist or can't be read (e.g. permissions problems), but it does produce these warnings:
Skipping 'T4' which didn't exist, or couldn't be read
Skipping 'T4_1.fq.gz T4_2.fq.gz' which didn't exist, or couldn't be read
The solution is to make sure all files exist. Note that the fromFilePairs factory method produces a list of files in the second element, so quoting a space-separated pair of filenames is also problematic. Pass the reads unquoted, and create the directory declared in the output block:
    script:
    """
    mkdir fastqc_${sample_id}_logs
    fastqc -o fastqc_${sample_id}_logs ${reads}
    """

RGI output filename does not match snakefile output filename

The problem is:
A software tool called RGI automatically appends .txt as a suffix to the output file. So if my sample ID is 7, the actual RGI output file will be 7.txt, which is different from the output file (7) defined in the Snakefile rule, and Snakemake reports errors like "Missing files after 20 seconds". However, RGI still appends the .txt suffix even if you have already given the name a suffix (the actual output file will then look like 7.txt.txt).
How can I solve the problem?
The following is a part of my code:
rule rgi:
    output:
        cardTxt = "{sampleId}/annotation/rgi/{sampleId}"
    input:
        faa = rules.prokka.output.faa,
        cardDb = config['rgi']['cardDb']
    shell:
        """
        rgi load -i {input.cardDb}
        rgi main -i {input.faa} -t protein -o {output.cardTxt} --include_loose --clean
        """
Strip the .txt suffix from the output filename before passing it to rgi. I do this here using bash string manipulation, but you can do it in other ways:
rule rgi:
    input:
        faa = rules.prokka.output.faa,
        cardDb = config['rgi']['cardDb'],
    output:
        cardTxt = "{sampleID}/annotation/rgi/{sampleID}.txt",
    shell:
        """
        card={output.cardTxt}
        card=${{card%.txt}}
        rgi load -i {input.cardDb}
        rgi main -i {input.faa} -t protein -o $card --include_loose --clean
        """
(I assume you want .txt to be part of the output filename, i.e. you are OK with 7.txt.)
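An alternative sketch under the same assumptions: compute the suffix-less path in a params entry, which keeps the shell block free of bash string manipulation (the lambda receives the rule's output object):

rule rgi:
    input:
        faa = rules.prokka.output.faa,
        cardDb = config['rgi']['cardDb'],
    output:
        cardTxt = "{sampleID}/annotation/rgi/{sampleID}.txt",
    params:
        # rgi main appends .txt itself, so hand it the path without the suffix
        prefix = lambda wildcards, output: output.cardTxt[:-len(".txt")],
    shell:
        """
        rgi load -i {input.cardDb}
        rgi main -i {input.faa} -t protein -o {params.prefix} --include_loose --clean
        """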

NameError in Snakemake dryrun mode

I am new to Snakemake and I am trying to develop some pipelines, using wildcards to automate my bioinformatic analyses as much as possible, but I run into trouble when the pipeline becomes more complex (as shown below). It looks like Snakemake does not resolve the wildcards correctly: during a dry run of the Snakefile, the wildcard values look correct in the executions of some rules, yet the same wildcards lead to an error in a different step (rule) of the pipeline, and I cannot figure out why. Below I provide the code and the output of a dry run.
num=["327905-LR-41624_normal","327907-LR-41624_tumor"]
num_normal=["327905-LR-41624"]
num_tumor=["327907-LR-41624"]
path="/path/to/Snakemake/"
genome="/path/to/references_genome/Mus_musculus.GRCm38.dna_rm.toplevel.fa"

rule all:
    input:
        expand("/path/to/Snakemake/AS-{num_tum}_tumor_no_dupl_sort_RG_LB.bam", num_tum=num_tumor),
        expand("/path/to/Snakemake/AS-{num_norm}_normal_no_dupl_sort_RG_LB.bam", num_norm=num_normal)

ruleorder: samtools_sort > remove_duplicates > samtools_index #> add_readgroup_tumor > add_readgroup_normal

rule trim_galore:
    input:
        r1="/path/to/Snakemake/AS-{num}_R1.fastq",
        r2="/path/to/Snakemake/AS-{num}_R2.fastq"
    output:
        "/path/to/Snakemake/AS-{num}_R1_val_1.fq",
        "/path/to/Snakemake/AS-{num}_R2_val_2.fq"
    shell:
        "module load trim-galore/0.5.0 ; module load pypy/2.7-6.0.0 ; trim_galore --output_dir /path/to/Snakemake/ --paired {input.r1} {input.r2} "

rule bwa_mem:
    input:
        R1="/path/to/Snakemake/AS-{num}_R1_val_1.fq",
        R2="/path/to/Snakemake/AS-{num}_R2_val_2.fq"
    output:
        "/path/to/Snakemake/AS-{num}.bam"
    shell:
        "module load samtools/default ; module load bwa/0.7.8 ; bwa mem {genome} {input.R1} {input.R2} | samtools view -h -b > {output} "

rule samtools_sort:
    input:
        "/path/to/Snakemake/AS-{num}.bam"
    output:
        "/path/to/Snakemake/AS-{num}_sort.bam"
    shell:
        "module load samtools/default ; samtools sort -n -O BAM {input} > {output} "

rule remove_duplicates:
    input:
        "/path/to/Snakemake/AS-{num}_sort.bam"
    output:
        outbam="/path/to/Snakemake/AS-{num}_no_dupl_sort.bam",
        metrics="/path/to/Snakemake/AS-{num}_dupl_metrics.txt"
    shell:
        "module load gatk/4.0.9.0 ; gatk MarkDuplicates -I {input} -O {output.outbam} -M {output.metrics} --REMOVE_DUPLICATES=true "

rule samtools_index:
    input:
        "/path/to/Snakemake/AS-{num}_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{num}_no_dupl_sort.bam.bai"
    shell:
        "module load samtools/default ; samtools index {input} "

rule add_readgroup_normal:
    input:
        "/path/to/Snakemake/AS-{num_normal}_normal_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{num_normal}_normal_no_dupl_sort_RG_LB.bam"
    shell:
        "module load gatk/4.0.9.0 ; gatk AddOrReplaceReadGroups -PL Illumina -LB { num_normal } -PU { num_normal } -SM NORMAL -I { input } -O {output} "

rule add_readgroup_tumor:
    input:
        "/path/to/Snakemake/AS-{num_tumor}_tumor_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{num_tumor}_tumor_no_dupl_sort_RG_LB.bam"
    shell:
        "module load gatk/4.0.9.0 ; gatk AddOrReplaceReadGroups -PL Illumina -LB { num_tumor } -PU { num_tumor } -SM TUMOR -I { input } -O {output} "
When I test the Snakefile with the command:
.local/bin/snakemake -s Snakefile_pipeline --dryrun
I get the following:
Building DAG of jobs...
Job counts:
    count   jobs
    1       add_readgroup_normal
    1       add_readgroup_tumor
    1       all
    2       bwa_mem
    2       remove_duplicates
    2       samtools_sort
    2       trim_galore
    11

[Mon Apr 8 16:14:27 2019]
rule trim_galore:
    input: /path/to/Snakemake/AS-327907-LR-41624_tumor_R1.fastq, /path/to/Snakemake/AS-327907-LR-41624_tumor_R2.fastq
    output: /path/to/Snakemake/AS-327907-LR-41624_tumor_R1_val_1.fq, /path/to/Snakemake/AS-327907-LR-41624_tumor_R2_val_2.fq
    jobid: 9
    wildcards: num=327907-LR-41624_tumor

[Mon Apr 8 16:14:27 2019]
rule trim_galore:
    input: /path/to/Snakemake/AS-327905-LR-41624_normal_R1.fastq, /path/to/Snakemake/AS-327905-LR-41624_normal_R2.fastq
    output: /path/to/Snakemake/AS-327905-LR-41624_normal_R1_val_1.fq, /path/to/Snakemake/AS-327905-LR-41624_normal_R2_val_2.fq
    jobid: 10
    wildcards: num=327905-LR-41624_normal

[Mon Apr 8 16:14:27 2019]
rule bwa_mem:
    input: /path/to/Snakemake/AS-327905-LR-41624_normal_R1_val_1.fq, /path/to/Snakemake/AS-327905-LR-41624_normal_R2_val_2.fq
    output: /path/to/Snakemake/AS-327905-LR-41624_normal.bam
    jobid: 8
    wildcards: num=327905-LR-41624_normal

[Mon Apr 8 16:14:27 2019]
rule bwa_mem:
    input: /path/to/Snakemake/AS-327907-LR-41624_tumor_R1_val_1.fq, /path/to/Snakemake/AS-327907-LR-41624_tumor_R2_val_2.fq
    output: /path/to/Snakemake/AS-327907-LR-41624_tumor.bam
    jobid: 7
    wildcards: num=327907-LR-41624_tumor

[Mon Apr 8 16:14:27 2019]
rule samtools_sort:
    input: /path/to/Snakemake/AS-327907-LR-41624_tumor.bam
    output: /path/to/Snakemake/AS-327907-LR-41624_tumor_sort.bam
    jobid: 5
    wildcards: num=327907-LR-41624_tumor

[Mon Apr 8 16:14:27 2019]
rule samtools_sort:
    input: /path/to/Snakemake/AS-327905-LR-41624_normal.bam
    output: /path/to/Snakemake/AS-327905-LR-41624_normal_sort.bam
    jobid: 6
    wildcards: num=327905-LR-41624_normal

[Mon Apr 8 16:14:27 2019]
rule remove_duplicates:
    input: /path/to/Snakemake/AS-327907-LR-41624_tumor_sort.bam
    output: /path/to/Snakemake/AS-327907-LR-41624_tumor_no_dupl_sort.bam, /path/to/Snakemake/AS-327907-LR-41624_tumor_dupl_metrics.txt
    jobid: 3
    wildcards: num=327907-LR-41624_tumor

[Mon Apr 8 16:14:27 2019]
rule remove_duplicates:
    input: /path/to/Snakemake/AS-327905-LR-41624_normal_sort.bam
    output: /path/to/Snakemake/AS-327905-LR-41624_normal_no_dupl_sort.bam, /path/to/Snakemake/AS-327905-LR-41624_normal_dupl_metrics.txt
    jobid: 4
    wildcards: num=327905-LR-41624_normal

[Mon Apr 8 16:14:27 2019]
rule add_readgroup_normal:
    input: /path/to/Snakemake/AS-327905-LR-41624_normal_no_dupl_sort.bam
    output: /path/to/Snakemake/AS-327905-LR-41624_normal_no_dupl_sort_RG_LB.bam
    jobid: 2
    wildcards: num_normal=327905-LR-41624

RuleException in line 93 of /home/l136n/Snakefile_mapping_snv_call_pipeline2:
NameError: The name ' num_normal ' is unknown in this context. Please make sure that you defined that variable. Also note that braces not used for variable access have to be escaped by repeating them, i.e. {{print $1}}
I have googled the error but found little help, and I have double-checked the pipeline for any inconsistency. What I expect as output is indicated in the rule all. The rules add_readgroup_normal and add_readgroup_tumor are supposed to take different subsets of the input files generated by the previous steps, which are run on all files, and I wonder if the problem somehow arises from this separation into two subsets. I am quite new to Snakemake, so I might be missing something silly somewhere. Any help would be really appreciated, as I am completely stuck. Thank you so much in advance!
Here is a modified version of the Snakefile and the error it produces:
num=["327905-LR-41624_normal","327907-LR-41624_tumor"]
normal=["327905-LR-41624_normal"]
num_tumor=["327907-LR-41624_tumor"]
path="/path/to/Snakemake/"
genome="/icgc/dkfzlsdf/analysis/B210/references_genome/Mus_musculus.GRCm38.dna_rm.toplevel.fa"

rule all:
    input:
        "/path/to/Snakemake/AS-327905-LR-41624_normal_R1_val_1.fq",
        "/path/to/Snakemake/AS-327905-LR-41624_normal_R2_val_2.fq",
        "/path/to/Snakemake/AS-327907-LR-41624_tumor_R1_val_1.fq",
        "/path/to/Snakemake/AS-327907-LR-41624_tumor_R2_val_2.fq",
        "/path/to/Snakemake/AS-327905-LR-41624_normal_no_dupl_sort.bam.bai",
        "/path/to/Snakemake/AS-327907-LR-41624_tumor_no_dupl_sort.bam.bai",
        "/path/to/Snakemake/AS-327905-LR-41624_normal_RG.bam"
        "/path/to/Snakemake/AS-327907-LR-41624_tumor_RG.bam"

rule trim_galore:
    input:
        r1="/path/to/Snakemake/AS-{num}_R1.fastq",
        r2="/path/to/Snakemake/AS-{num}_R2.fastq"
    output:
        "/path/to/Snakemake/AS-{num}_R1_val_1.fq",
        "/path/to/Snakemake/AS-{num}_R2_val_2.fq"
    shell:
        "module load trim-galore/0.5.0 ; module load pypy/2.7-6.0.0 ; trim_galore --output_dir /path/to/Snakemake/ --paired {input.r1} {input.r2} "

rule bwa_mem:
    input:
        R1="/path/to/Snakemake/AS-{num}_R1_val_1.fq",
        R2="/path/to/Snakemake/AS-{num}_R2_val_2.fq"
    output:
        "/path/to/Snakemake/AS-{num}.bam"
    shell:
        "module load samtools/default ; module load bwa/0.7.8 ; bwa mem {genome} {input.R1} {input.R2} | samtools view -h -b > {output} "

rule samtools_sort:
    input:
        "/path/to/Snakemake/AS-{num}.bam"
    output:
        "/path/to/Snakemake/AS-{num}_sort.bam"
    shell:
        "module load samtools/default ; samtools sort -n -O BAM {input} > {output} "

rule remove_duplicates:
    input:
        "/path/to/Snakemake/AS-{num}_sort.bam"
    output:
        outbam="/path/to/Snakemake/AS-{num}_no_dupl_sort.bam",
        metrics="/path/to/Snakemake/AS-{num}_dupl_metrics.txt"
    shell:
        "module load gatk/4.0.9.0 ; gatk MarkDuplicates -I {input} -O {output.outbam} -M {output.metrics} --REMOVE_DUPLICATES=true "

rule samtools_index:
    input:
        "/path/to/Snakemake/AS-{num}_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{num}_no_dupl_sort.bam.bai"
    shell:
        "module load samtools/default ; samtools index {input} "

rule add_readgroup_normal:
    input:
        "/path/to/Snakemake/AS-{normal}_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{normal}_RG.bam"
    shell:
        "module load gatk/4.0.9.0 ; gatk AddOrReplaceReadGroups -PL Illumina -LB { wildcards.normal } -PU { wildcards.normal } -SM NORMAL -I { input } -O {output} "

rule add_readgroup_tumor:
    input:
        "/path/to/Snakemake/AS-{num}_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{num_,'.*tumor.*'}_RG.bam"
    shell:
        "module load gatk/4.0.9.0 ; gatk AddOrReplaceReadGroups -PL Illumina -LB { wildcards.num } -PU { wildcards.num } -SM TUMOR -I { input } -O {output} "
Error:
Building DAG of jobs...
MissingInputException in line 37 of /home/l136n/Snakefile_mapping_snv_call_pipeline2b1:
Missing input files for rule trim_galore:
/path/to/Luca/Snakemake/AS-327905-LR-41624_normal_RG.bam/path/to/Luca/Snakemake/AS-327907-LR-41624_tumor_RG_R1.fastq
/path/to/Snakemake/AS-327905-LR-41624_normal_RG.bam/path/to/Luca/Snakemake/AS-327907-LR-41624_tumor_RG_R2.fastq
Wildcards are accessible in the shell command using the syntax {wildcards.var}, not {var}. You have the latter in rule add_readgroup_normal.
Source.
I thought I would provide the solution, even if the post is a bit old now. The error was simply due to the presence of spaces inside "{ wildcards.var }". (The MissingInputException from the modified version is a separate problem: the comma between the two _RG.bam entries in rule all is missing, so Python concatenates the two adjacent strings into one nonexistent path, which is visible in the error message.)
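For completeness, a minimal sketch of the corrected rule, with the spaces removed and the wildcard accessed via wildcards.normal:

rule add_readgroup_normal:
    input:
        "/path/to/Snakemake/AS-{normal}_no_dupl_sort.bam"
    output:
        "/path/to/Snakemake/AS-{normal}_RG.bam"
    shell:
        "module load gatk/4.0.9.0 ; "
        "gatk AddOrReplaceReadGroups -PL Illumina -LB {wildcards.normal} "
        "-PU {wildcards.normal} -SM NORMAL -I {input} -O {output}"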

Use wildcard on params

I am trying to use a tool, and I need a value derived from a wildcard that is only present in the input. This is an example:
aDict = {"120":"121" } #tumor : normal
rule all:
input: expand("{case}.mutect2.vcf",case=aDict.keys())
def get_files_somatic(wildcards):
case = wildcards.case
control = aDict[wildcards.case]
return [case + ".sorted.bam", control + ".sorted.bam"]
rule gatk_Mutect2:
input:
get_files_somatic,
output:
"{case}.mutect2.vcf"
params:
genome="ref/hg19.fa",
target= "chr12",
name_tumor='{case}'
log:
"logs/{case}.mutect2.log"
threads: 8
shell:
" gatk-launch Mutect2 -R {params.genome} -I {input[0]} -tumor {params.name_tumor} -I {input[1]} -normal {wildcards.control}"
" -L {params.target} -O {output}"
I have this error:
'Wildcards' object has no attribute 'control'
So I have a function that determines both case and control, but I am not able to access the control value in the shell command.
The wildcards are derived from the output file/pattern. That is why you only have the wildcard called case; you have to derive the control from it. Try replacing your shell statement with this:
run:
    control = aDict[wildcards.case]
    shell(
        "gatk-launch Mutect2 -R {params.genome} -I {input[0]} "
        "-tumor {params.name_tumor} -I {input[1]} -normal {control} "
        "-L {input.target2} -O {output}"
    )
You could define control in params. Also, {input.target2} in the shell command would result in an error; maybe it's supposed to be params.target?
rule gatk_Mutect2:
    input:
        get_files_somatic,
    output:
        "{case}.mutect2.vcf"
    params:
        genome="ref/hg19.fa",
        target="chr12",
        name_tumor='{case}',
        control = lambda wildcards: aDict[wildcards.case]
    shell:
        """
        gatk-launch Mutect2 -R {params.genome} -I {input[0]} -tumor {params.name_tumor} \\
        -I {input[1]} -normal {params.control} -L {params.target} -O {output}
        """

How can I write the spawn command over multiple lines in an Expect program?

I have written this little script for getting multiple files from my remote server to my host computer:
#!/usr/bin/expect -f
spawn scp \
    user@remote:/home/user/{A.txt,B.txt} \
    /home/user_local/Documents
expect "password: "
send "somesecretpwd\r"
interact
This works fine, but when I try to split the file list over several lines, like this:
user@remote:/home/user/{A.txt,\
B.txt} \
I am getting the following error(s):
scp: /home/user/{A.txt,: No such file or directory
scp: B.txt}: No such file or directory
I tried this:
user#remote:"/home/user/{A.txt,\
B.txt}" \
getting:
bash: -c: line 0: unexpected EOF while looking for matching `"'
bash: -c: line 1: syntax error: unexpected end of file
cp: cannot stat 'B.txt}"': No such file or directory
or this:
"user#remote:/home/user/{A.txt,\
B.txt}" \
getting the same error at the beginning.
How can I write the files over multiple lines so that the program still works correctly? I need this for better readability of the chosen files.
Edit:
Only changed the local user name to user_local
In Tcl (and so Expect), \<NEWLINE><SPACEs> is converted into one single <SPACE>, so you cannot split a string that must contain no spaces over multiple lines:
% puts "abc\
def"
abc def
% puts {abc\
def}
abc def
%
Assuming the filenames are really longer (there is not much point otherwise), you could use a couple of variables like this:
#!/usr/bin/expect -f
set A A.txt
set B B.txt
spawn scp \
    user@remote:/home/user/{$A,$B} \
    /home/user/Documents
expect "password: "
send "somesecretpwd\r"
interact
For anyone who wants to solve a similar problem using only Expect: you can write a list of files and then join them into one string. Here is the code:
#!/usr/bin/expect -f
# a list of files (a comment inside the braces would become list elements)
set files {
    A.txt
    B.txt
    C.txt
}

# join returns the concatenated string of all files;
# in this example: A.txt,B.txt,C.txt
set concat [join $files ,]

# hand-rolled version of the join above:
# set concat [lindex $files 0]               ;# get the first file
# set last_idx [expr {[llength $files]-1}]   ;# calc the last index of the list
# set rest_files [lrange $files 1 $last_idx] ;# get the other files
# foreach file $rest_files {
#     set concat $concat,$file               ;# append a comma and the next file
# }
# puts "$concat"                             ;# only for testing the output

spawn scp \
    user@remote:/home/doublepmcl/{$concat} \
    /home/user_local/Documents
expect "password: "
send "somesecretpwd\r"
interact
