12-03-2007   #7
tomaszinc

Re: stdout redirection

Hello again,

Thanks for all your answers.
I understand the choices which have been made in designing PowerShell and I
think the native Unicode support and the object-oriented view are great.
PowerShell offers a lot of features which I missed in all the shells I have
worked with (e.g. creating forms). In fact, I think it is not yet powerful
enough: in my opinion a really powerful shell should enable the user to
completely control the operating system, including for example moving the
mouse on screen, sending signals and messages to all processes and
applications, and so on. PowerShell is well on the way to achieving this.
However, I still have serious issues with how PowerShell handles the output of
command line utilities.
Take the following C application for example (let's call it ascii.exe from
now on):
-----------code-----------
#include <stdio.h>

int main(void) {
    /* write every byte value 0x00..0xff to stdout, unformatted */
    for (int i = 0; i < 256; i++) {
        printf("%c", i);
    }
    return 0;
}
-----------code-----------
As you can see, all it does is print all ASCII characters, including the
extended set, to stdout. If you execute the following line under any shell of
your choice (well, I tested it with cmd.exe, sh.exe, cygwin.exe, and native
Unix sh and bash)

# ascii.exe > ascii_code.txt

then the resulting file will consist of all characters from 0x00 up to 0xff
without any spaces or newlines. Whether or not the representation on screen
is the same on the different systems or shells is not of interest; what matters
is that the binary content of the file exactly matches the output
generated by ascii.exe.
However, in PowerShell this is not the case. If I run the above command line
there, the resulting file looks like this (in hex now):

ff fe 00 00 01 00 02 00 03 00 ...
... 7f 00 c7 00 fc 00 e9 00 e2 00 ...
....92 25 93 25 02 25 24 25...

First, a prefix (ff fe) is added, then a NUL byte follows every character,
apparently as a delimiter. The characters after 0x7f are completely
changed, randomly I guess; at least I can't find a pattern.
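
Looking at the dump again, it is at least consistent with one specific transformation: each byte is decoded with an OEM code page and the result is written as UTF-16 little-endian with a byte order mark. Below is a minimal Python sketch of that idea. It assumes CP437 as the code page (an assumption; CP850 maps these particular bytes the same way), and `oem_to_utf16` is a made-up helper name, not anything PowerShell provides.

```python
import codecs

def oem_to_utf16(raw, codepage="cp437"):
    """Decode raw bytes with an OEM code page, re-encode as UTF-16LE plus BOM.

    Hypothetical helper: it reproduces the bytes seen in the dump,
    it is not PowerShell's actual implementation.
    """
    text = raw.decode(codepage)
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Bytes 0x00-0x03 and 0x80-0x83, as emitted by ascii.exe
low = oem_to_utf16(bytes([0x00, 0x01, 0x02, 0x03]))
high = oem_to_utf16(bytes([0x80, 0x81, 0x82, 0x83]))
print(low.hex(" "))   # ff fe 00 00 01 00 02 00 03 00
print(high.hex(" "))  # ff fe c7 00 fc 00 e9 00 e2 00
```

If this is what happens, the NUL after every character is simply the high byte of a 16-bit code unit rather than a delimiter, and ff fe is the UTF-16 byte order mark.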

If you now run the following line:

PS > ascii.exe | out-file -encoding ascii ascii_code.txt

then all characters up to 0x7f are right, but after that only '?' (0x3f)
follows. At the end, the newline characters 0x0d 0x0a are appended. Sure,
it makes perfect sense to drop the extended character set when using ASCII
encoding. The final newline, however, certainly does not belong there. So no
luck again.

Now the next try:

PS > ascii.exe | out-file -encoding oem ascii_code.txt

Now we're getting close. All characters from 0x00 to 0xff are present in the
right order without any NULs or newlines. However, the useless 0x0d 0x0a is
appended again, so there is still no exact match of the output.
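
The two out-file results can be reproduced outside PowerShell. ASCII only defines 0x00 to 0x7f, so an ASCII encoder has to substitute everything above that, while an OEM code page covers all 256 values and round-trips them. This is a rough Python sketch, assuming that CP437 stands in for the OEM code page and that the program's bytes are decoded to text first (both are assumptions, not confirmed PowerShell behaviour).

```python
raw = bytes(range(256))          # what ascii.exe writes to stdout
text = raw.decode("cp437")       # assumed OEM decoding step

# ASCII cannot represent code points above 0x7f, so they become '?'
as_ascii = text.encode("ascii", errors="replace")

# The OEM code page maps every character back to its original byte
as_oem = text.encode("cp437")

print(as_ascii[:128] == raw[:128])   # True
print(as_ascii[128:] == b"?" * 128)  # True
print(as_oem == raw)                 # True
```

Neither encoding step adds a trailing 0x0d 0x0a by itself, so under these assumptions the extra newline would come from the line-oriented writing rather than from the encoding.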

My next idea was to assign the command line output to a variable.

PS > $ascii = ascii.exe

But guess what, even the content of $ascii is not the same as the output of
ascii.exe (well, at least the output on screen is not)!

PS > ascii.exe
☺☻♥♦♣
♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼ !"#$%&'()*+,-./0123456789:;
.... ...

PS > $ascii
☺☻♥♦
♂♀
♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼ !"#$%&'()*+,-./0123456789:;
.... ...

So from my novice point of view there is no way at all to operate on the
data ascii.exe produces.
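
One possible reason a variable cannot give the original bytes back is that once output is stored as a list of lines, the information about which terminator ended each line, and whether the last line had one at all, is gone. The small Python sketch below shows that effect. It is only an analogy, assuming the output is held as an array of strings as the quoted reply further down describes. It is not a model of PowerShell's internals.

```python
data = b"one\ntwo\r\nthree"          # mixed terminators, none at the end

lines = data.splitlines()            # terminators are dropped here
rejoined = b"\r\n".join(lines) + b"\r\n"   # write each line back with CRLF

print(lines)              # [b'one', b'two', b'three']
print(rejoined == data)   # False
```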

Seriously, you can't do that. A shell must not under any circumstances
mess around with data produced by any application. It must guarantee that
you can take this data, write it to files or variables, send it over the network
or whatever, and always get the exact same binary representation of the
original output. There are countless tools that take input from pipes
or files and produce output on stdout; none of them will work right under
PowerShell, at least not if you want to use intermediate files or even
variables.

My specific problem is: I'm doing language processing, which means using
numerous Perl, Python and Prolog scripts and some self-written applications
which all operate on characters, or more specifically on their hex
representation. PowerShell always ruins the output of these scripts and apps.
This is unacceptable.

The problem here, I guess, is that PowerShell does more than it should, for
example passing everything to a formatter and "nicely" representing it on the
screen. In my opinion a shell is not at all responsible for how
the data is represented on screen; the application is. Doing otherwise would
only make sense in a closed environment where you have complete control over
which applications are executed. With a shell, you don't have that (or is that
what you want?).
Take this statement:
Quote:

>> Obvious when you send an array of strings down the pipe (where they're not
>> CRLF terminated) and you redirect that to a file, you would want each string to
>> correspond to a line in the file.
Are you sure of that? What if I don't want that? Why are you making this
decision for me?

A solution could be to have the '>' operator redirect the data in binary
form, without any formatters messing it up (if this is possible).

What I wanted to do is build a nice GUI that unifies all my scripts and apps
and allows me to do all the analysis (including some computations) without
having to run lots of command lines. But at the moment I see no way to make
this possible; I don't even see how I can write a normal batch in
PowerShell that executes all the scripts and apps without ending up with
garbled and wrong data.
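
As an illustration of what redirecting in binary form means, here is a sketch in Python rather than PowerShell. The child process writes straight into a file opened in binary mode, so nothing in between decodes or reformats the bytes. The stand-in child command and the function name `run_to_file` are invented for this example; ascii.exe would take the place of the stand-in.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_to_file(cmd, path):
    """Run cmd and send its stdout to path without any text decoding."""
    with open(path, "wb") as out:
        subprocess.run(cmd, stdout=out, check=True)

# Stand-in for ascii.exe: a child process that emits bytes 0x00..0xff
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(bytes(range(256)))"]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "ascii_code.txt"
    run_to_file(child, target)
    result = target.read_bytes()

print(result == bytes(range(256)))  # True
```

A driver script built this way could chain several tools through intermediate files and compare each file byte for byte with what the tool produced.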

So seriously, you should REALLY reconsider what you are doing here, this
just won't work.

Regards
Tomas

"Keith Hill [MVP]" wrote:
Quote:

> "Kuma" <kumasan76@xxxxxx> wrote in message
> news:df04d430-83b5-4e64-a47e-7bcfa100d6f8@xxxxxx
Quote:

> > On Dec 3, 7:35 am, tomaszinc <tomasz...@xxxxxx>
> > wrote:
Quote:

> >> Hello everyone,
> >> I'm pretty new to powershell scripting so this might seem like a stupid
> >> question, excuse me for that. I already searched the forum and
> >> documentation
> >> but couldn't find a solution so I take my chances here.
> >>
> >> Basically all I want to do is to write the output of some command line
> >> utilities and python scripts into a text file. With cmd I used the '>'
> >> operator for that. However, the resulting file is messed up with spaces
> >> between each character. I searched the documentation and found info about
> >> the
> >> commands 'out-file' and 'set-content'. There are a number of different
> >> encodings which can be set. I tried every combination of {out-file,
> >> set-content} -encoding *, however, the outfile never matches the output
> >> generated on the console. Instead either spaces or new-lines are
> >> inserted.
> >> Since I use this output for further processing it is vital that the file
> >> exactly matches the output on stdout.
> >> Can anybody tell me how I can redirect the output on stdout to a file
> >> without powershell messing with the content?
> >>
> >> best regards and thanks a lot
> >
> > Take a look here:
> >
> > http://keithhill.spaces.live.com/blo...3A97!811.entry
> >
> > Excellent page for this topic.
>
> Thanks. This particular issue can be annoying. Obvious when you send an
> array of strings down the pipe (where they're not CRLF terminated) and you
> redirect that to a file, you would want each string to correspond to a line
> in the file. That requires PowerShell to append its own CRLF to the end of
> each line even if the string already has a CRLF at the end e.g.:
>
> 126# "one`r`n","two`r`n","three" | Out-File .\output.txt -Encoding ascii
> 127# format-hex C:\Temp\output.txt
>
> Address:  0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F 10 11 12 13 14 15 16 17 ASCII
> -------- ----------------------------------------------------------------------- ------------------------
> 00000000 6F 6E 65 0D 0A 0D 0A 74 77 6F 0D 0A 0D 0A 74 68 72 65 65 0D 0A          one....two....three..
>
> NOTE: Format-Hex is a PSCX cmdlet. I wonder if it would have made more
> sense to have PowerShell punt on adding CRLF when the string already ends in
> a CRLF. Hmmm. Anyway you can do what you want. Unfortunately it isn't as
> straightforward as you would hope:
>
> 128# $output = csc.exe
> 129# $text = [String]::Join([Environment]::NewLine, $output)
> 130# [System.IO.File]::WriteAllText("$pwd\foo.txt", $text)
>
> For what it's worth, if you use the WriteAllLines method in the File class,
> which takes an array of strings, you wind up with the exact same problem -
> an extra CRLF at the end.
>
> --
> Keith
>