Windows Vista Forums
Merging/Joining two files

03-09-2007   #1
marc@carl39.freeserve.co.uk
Guest

Merging/Joining two files

Hello
I was wondering if it was possible to use PS to join or merge two
files.
I want to merge files that have a common column:
Example:
genes.txt:
gene1<tab>kinse_a
gene2<tab>kinse_b

and
data.txt:
gene1<tab>+8.6
gene2<tab>-8.4

I have been using Cygwin on Windows, and the join and awk commands seem
a bit flaky there. So I want to use PS, since I have it installed and
haven't done anything with it yet.
Any help would be great.

Cheers

Mark


03-09-2007   #2
patrick.harzheim@gmail.com
Guest

Re: Merging/Joining two files

Joined on Gene*...somewhat?!
PS _> $g | foreach {$gi = $_; $d | foreach{ $di = $_; if( $di.Gen -eq $gi.Gen){ [string]$di + [string]$gi } }}
@{Gen=gene1; Decimal=+8.6}@{Gen=gene1; name=kinse_a}
@{Gen=gene2; Decimal=-8.4}@{Gen=gene2; name=kinse_b}
@{Gen=; Decimal=}@{Gen=; name=}


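You'll need to do this first: give each file a comma-separated header row
and load it with import-csv (data.txt goes into $d the same way, with the
header Gen,Decimal).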

PS _> gc genes.txt
Gen,name
gene1,kinse_a
gene2,kinse_b

PS _> $g = import-csv genes.txt
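
If editing a header row into each file by hand is inconvenient, something like the following may work instead. It is only a sketch: it assumes PowerShell 2.0 or later, where Import-Csv accepts -Delimiter and -Header, and the temp folder name join-demo and the column names Gen, Name and Decimal are assumptions chosen to match the example above.

```powershell
# Assumption: PowerShell 2.0 or later (Import-Csv -Delimiter and -Header).
# Create small sample files that mirror the example in the question.
$dir = Join-Path ([IO.Path]::GetTempPath()) 'join-demo'   # hypothetical folder
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$genesPath = Join-Path $dir 'genes.txt'
$dataPath  = Join-Path $dir 'data.txt'
Set-Content -Path $genesPath -Value "gene1`tkinse_a", "gene2`tkinse_b"
Set-Content -Path $dataPath  -Value "gene1`t+8.6", "gene2`t-8.4"

# Read each tab-separated file as objects. The files have no header row,
# so the column names are supplied on the command line instead.
$g = @(Import-Csv -Path $genesPath -Delimiter "`t" -Header Gen, Name)
$d = @(Import-Csv -Path $dataPath  -Delimiter "`t" -Header Gen, Decimal)

# Show what was loaded
$g | Format-Table -AutoSize
$d | Format-Table -AutoSize
```

With $g and $d loaded this way, the original tab-separated files can stay untouched.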

03-09-2007   #3
George Davis
Guest

RE: Merging/Joining two files

Are there always the same number of entries in the two files and, if so, will
the entries always "line up", meaning that line 50 in "genes.txt"
corresponds to line 50 in "data.txt"?

George

03-10-2007   #4
marc@carl39.freeserve.co.uk
Guest

Re: Merging/Joining two files

Hiya thanks for the post.

Any chance you could comment this code a tad?
I have done the last part OK (get-content and import-csv),
but the first part is a bit complicated to understand; as I said, I'm
new to PS.

Cheers

Mark

03-10-2007   #5
marc@carl39.freeserve.co.uk
Guest

Re: Merging/Joining two files

Hi George,

Yes, the two files will have the same number of lines - they are
basically input and output files from a microarray experiment (if that
is of any interest/importance).
Thanks

Mark

03-11-2007   #6
Flowering Weeds
Guest

Re: Merging/Joining two files


> I was wondering if it was possible
> to use PS to join or merge two files.
> I want to merge files that have a
> common column:


Perhaps Microsoft's Log Parser 2.2

PS> LogParser -h -i:tsv genes.txt -headerRow:off

Input format: TSV (TSV Format)
Parses text files containing tab- or space- separated values
(...)
Fields:
Filename (S) RowNumber (I) Field1 (S) Field2 (S)

Parse (search) on field1 in both files
and put the output into both.txt file!



03-11-2007   #7
patrick.harzheim@gmail.com
Guest

Re: Merging/Joining two files

You specified: "I want to merge files that have a common column:"
You might want to merge by row number too, right?

03-11-2007   #8
patrick.harzheim@gmail.com
Guest

Re: Merging/Joining two files

What's the matter, can't "sort" things out? :)

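# Add a header row to both data.txt and genes.txt
PS _> gc genes.txt
Gen,name
gene1,kinse_a
gene2,kinse_b

# Load each file as objects ($d = import-csv data.txt likewise)
PS _> $g = import-csv genes.txt

# Line by line: for every gene row, scan the data rows and
# print the combined fields whenever the Gen values match
PS _> foreach ($gen in $g)
>{
>    foreach ($dat in $d)
>    {
>        if ($gen.Gen -eq $dat.Gen)
>        {
>            $gen.Gen + $gen.Name + $dat.Decimal
>        }
>    }
>}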
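
The nested loops above compare every row of $g with every row of $d, which gets slow as the files grow. One possible alternative is to index one file in a hash table and look each gene up once. This is only a sketch with hypothetical sample rows, assuming PowerShell 2.0 or later (for ConvertFrom-Csv and -join); the names $lookup and $joined are illustrative, and the tab separator in the output is an assumption about the desired format.

```powershell
# Hypothetical sample rows standing in for the two imported files.
# Assumption: PowerShell 2.0 or later (ConvertFrom-Csv, -join).
$g = @("Gen,Name", "gene1,kinse_a", "gene2,kinse_b" | ConvertFrom-Csv)
$d = @("Gen,Decimal", "gene1,+8.6", "gene2,-8.4" | ConvertFrom-Csv)

# Index the data rows by their Gen value
$lookup = @{}
foreach ($dat in $d)
{
    $lookup[$dat.Gen] = $dat.Decimal
}

# One pass over the gene rows, one hash lookup per row;
# genes without a matching data row are skipped
$joined = foreach ($gen in $g)
{
    if ($lookup.ContainsKey($gen.Gen))
    {
        $gen.Gen, $gen.Name, $lookup[$gen.Gen] -join "`t"
    }
}

$joined
```

Each gene row costs a single lookup here, so the work grows with the total number of rows rather than with the product of the two file sizes.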


03-12-2007   #9
George Davis
Guest

Re: Merging/Joining two files

Marc,
Here is a script that I got working to do what you want. There may be a more
elegant way to do this in PS but I haven't found it.

The basic idea is to use a foreach loop to go through the genes.txt file
and, for each line, get the corresponding entry from the
data.txt file.

Let me know if you have any problems getting it to work. I made a file of
35,000 entries and it was processing about 100 records per second on my
laptop, which is an IBM T42 with 2 gigs of memory. The "Write-Host" lines at
the top of the file and printing out the value of $i at the end of the loop
are just so you can see the progress speed:

# build an array of entries from the
# genes.txt file
$genes = Get-Content c:\genes.txt

Write-Host "Genes file read"

# build an array of entries from the
# data.txt file
$data = Get-Content c:\data.txt

Write-Host "Both files read"

Clear-Host

[int] $i = 0
# loop through the genes array, parse
# each line, get matching entry from
# data array and put together into a file
foreach($gene in $genes)
{
    # split the current line from the genes.txt
    # file into an array of 2 items using
    # the tab character as the delimiter
    $oneLineGenes = $gene.Split("`t")

    # assign each of the 2 pieces of gene info
    # to a variable

    # $geneNumber would be "gene1" in first line
    $geneNumber = $oneLineGenes[0]

    # $geneName would be "kinse_a" in first line
    $geneName = $oneLineGenes[1]

    # $data[$i] retrieves the correct line number from
    # the data.txt file.
    # split the current line from the data.txt
    # file into an array of 2 elements using
    # the tab character as the delimiter
    $oneLineData = $data[$i].Split("`t")

    # $geneDataValue would be "+8.6" in first line
    $geneDataValue = $oneLineData[1]

    # build a variable that will look like this for the
    # first line in both files:
    # "gene1,kinse_a,+8.6"
    $lineToWrite = $geneNumber + "," + $geneName + "," + $geneDataValue

    # now write that variable out to a file
    $lineToWrite | out-file c:\out.txt -append

    # increment $i so that we always retrieve the
    # correct line number from the data.txt file
    $i++

    # for performance testing - comment out for production
    if( ($i % 100) -eq 0 )
    {
        $i
    }
}

Hope this helps,
George
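
A possible variation on the script above: appending to out.txt once per line makes Out-File reopen the file on every iteration, which may account for part of the 100-records-per-second figure. The sketch below collects the merged lines first and writes them in a single call. It is not a measured improvement; the temp folder name merge-demo and the sample contents are assumptions, and the @() wrappers are there so that a one-line file still yields an array rather than a single string.

```powershell
# Hypothetical sample files in a temp folder (paths are assumptions)
$dir = Join-Path ([IO.Path]::GetTempPath()) 'merge-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$genesPath = Join-Path $dir 'genes.txt'
$dataPath  = Join-Path $dir 'data.txt'
$outPath   = Join-Path $dir 'out.txt'
Set-Content -Path $genesPath -Value "gene1`tkinse_a", "gene2`tkinse_b"
Set-Content -Path $dataPath  -Value "gene1`t+8.6", "gene2`t-8.4"

# @() keeps each result an array even when a file has only one line
$genes = @(Get-Content $genesPath)
$data  = @(Get-Content $dataPath)

# merging by row number only makes sense if the files line up
if ($genes.Count -ne $data.Count)
{
    throw "genes.txt and data.txt have different line counts"
}

# build every output line in memory
$lines = for ($i = 0; $i -lt $genes.Count; $i++)
{
    $geneFields = $genes[$i].Split("`t")
    $dataFields = $data[$i].Split("`t")

    # gene number, gene name, data value
    $geneFields[0] + "," + $geneFields[1] + "," + $dataFields[1]
}

# a single write instead of one append per line
$lines | Out-File $outPath
```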

03-14-2007   #10
mark carlile
Guest

Re: Merging/Joining two files

WoW!!!
George,
That's really neat (and elegant if I may say so).
It seems to work better than any other way I have tried.

Cheers mate :-)

Mark
