Windows Vista Forums

VBScript String Replace - Remove / Replace Characters in String

  1. #1


    dsoutter Guest

    VBScript String Replace - Remove / Replace Characters in String

    VBScript String Replace

    http://www.code-tips.com/2009/04/vbs...on-remove.html

    Remove or replace specific characters from a string. The article below
    provides a function in VBScript to remove or replace characters in a
    string.



  3. #2


    dsoutter Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    http://groups.google.com/group/web-p...cc0e6307ccbce0

    On Mar 2, 4:11 pm, dsoutter <webmasterhub....@newsgroup> wrote:

    > VBScript String Replace
    >
    > http://www.code-tips.com/2009/04/vbs...function-remov...
    >
    > Remove or replace specific characters from a string. The article below
    > provides a function in VBScript to remove or replace characters in a
    > string.

  4. #3


    Al Dunbar Guest

    Re: VBScript String Replace - Remove / Replace Characters in String



    "dsoutter" <webmasterhub.net@newsgroup> wrote in message
    news:a49b3803-8e79-46eb-a8c2-61454f499ee7@newsgroup

    > http://groups.google.com/group/web-p...cc0e6307ccbce0
    >
    > On Mar 2, 4:11 pm, dsoutter <webmasterhub....@newsgroup> wrote:

    >> VBScript String Replace
    >>
    >> http://www.code-tips.com/2009/04/vbs...function-remov...
    >>
    >> Remove or replace specific characters from a string. The article below
    >> provides a function in VBScript to remove or replace characters in a
    >> string.

    Here is how I would code this function if I ever needed such a thing:

    msgbox clean("C:\<test>&<done>")

    function clean (strtoclean)
        dim strtemp, badchars, badchar, goodchar
        strtemp = strtoclean
        ' The list must stay on one logical line, so it is continued with " _"
        badchars = Array("?","/","\",":","*","""","<",">","","&", _
                         "#","~","%","{","}","+","_",".")
        for each badchar in badchars
            ' Choose the replacement for this character
            select case badchar
                case "&": goodchar = " and "
                case ":": goodchar = "-"
                case else: goodchar = " "
            end select
            strtemp = replace( strtemp, badchar, goodchar )
        next
        clean = strtemp
    end function

    IMHO, this has the same result but the logic is somewhat simpler. What
    benefit would I get from switching from my version to yours?

    /Al




  5. #4


    dsoutter Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    On Mar 3, 3:16 pm, "Al Dunbar" <aland...@newsgroup> wrote:

    > IMHO, this has the same result but the logic is somewhat simpler. What
    > benefit would I get from switching from my version to yours?

    Hi Al, your logic is simpler because it uses the Replace() function to
    do the replacement, whereas the function provided takes the left and
    right parts of the string on either side of each illegal character. In
    many cases your method would be more suitable, mainly because of the
    simpler logic, especially when all instances of each character are to
    be processed in the same way.

    As the method provided parses the string character by character, you
    should have greater control over the output when more complex
    operations are needed, such as removing or replacing a character only
    if it is within a specific context:

    E.g. replace "&" with " and " if padded with spaces or another specific
    character, or with a "+" if not:
    "something & something else" would become "something and something else"
    "somethin&something else" would become "somethin+something else".

    E.g. replace ":" only if NOT part of a URL:

    "the website is http://code-tips.com " would remain
    "the website is http://code-tips.com "
    "See Here: http://code-tips.com " would become
    "See Here http://code-tips.com "

    This would be achieved either by checking the previous 3-5 characters
    when a ":" is found, to see whether it is in the context of a URL
    (http, https, ftp), or by checking whether the characters following the
    current ":" are "//", which would indicate that the colon is part of
    a URL.
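
    The two context rules above could be sketched roughly as follows. This
    is only an illustration: ReplaceAmpersand and StripColons are
    hypothetical names, not functions from the linked article, and the URL
    test uses the simpler of the two checks mentioned (a ":" followed
    by "//").

```vbscript
' Hypothetical helpers, not taken from the linked article.

' Replaces "&" with "and" when it has a space on both sides,
' otherwise with "+".
Function ReplaceAmpersand(sIn)
    Dim i, ch, prevCh, nextCh, sOut
    sOut = ""
    For i = 1 To Len(sIn)
        ch = Mid(sIn, i, 1)
        If ch = "&" Then
            ' Neighbours are "" at either end of the string
            prevCh = ""
            If i > 1 Then prevCh = Mid(sIn, i - 1, 1)
            nextCh = Mid(sIn, i + 1, 1)
            If prevCh = " " And nextCh = " " Then
                sOut = sOut & "and"   ' the surrounding spaces are copied as-is
            Else
                sOut = sOut & "+"
            End If
        Else
            sOut = sOut & ch
        End If
    Next
    ReplaceAmpersand = sOut
End Function

' Removes every ":" unless it is directly followed by "//".
Function StripColons(sIn)
    Dim i, ch, sOut
    sOut = ""
    For i = 1 To Len(sIn)
        ch = Mid(sIn, i, 1)
        If ch = ":" Then
            If Mid(sIn, i + 1, 2) = "//" Then sOut = sOut & ch
        Else
            sOut = sOut & ch
        End If
    Next
    StripColons = sOut
End Function
```

    With these definitions, ReplaceAmpersand("somethin&something else")
    returns "somethin+something else", and
    StripColons("See Here: http://code-tips.com") returns
    "See Here http://code-tips.com".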

    This functionality has not been included in the function provided, but
    would be easy to implement, as the string is incrementally parsed and
    manipulated using a numeric string position value relative to the
    current position/character in the string.

    There may also be differences in performance between the two methods,
    as the function provided includes the code required to remove or
    replace each of the specified characters without calling the replace()
    function. I suspect that the replace function uses a similar approach
    to replace the specified characters so any difference in performance
    would be minimal, unless parsing a large string value. I haven't yet
    tested this for performance differences.
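
    One way to measure the difference would be a small timing harness like
    the sketch below. The names (CleanWithReplace, CleanWithLoop, TimeIt)
    and the test string are assumptions made for illustration, and
    CleanWithLoop is only a stand-in for a character-by-character
    approach, not the function from the article.

```vbscript
' Approach 1: the built-in Replace() function
Function CleanWithReplace(sIn)
    CleanWithReplace = Replace(sIn, "*", " ")
End Function

' Approach 2: a character-by-character loop with string concatenation
Function CleanWithLoop(sIn)
    Dim i, ch, sOut
    sOut = ""
    For i = 1 To Len(sIn)
        ch = Mid(sIn, i, 1)
        If ch = "*" Then ch = " "
        sOut = sOut & ch
    Next
    CleanWithLoop = sOut
End Function

' Returns the seconds taken to call the named function nRuns times.
' Timer counts seconds since midnight, so a run that crosses midnight
' gives a meaningless result.
Function TimeIt(fnName, sIn, nRuns)
    Dim fn, tStart, n, sResult
    Set fn = GetRef(fnName)
    tStart = Timer
    For n = 1 To nRuns
        sResult = fn(sIn)
    Next
    TimeIt = Timer - tStart
End Function

Dim sTest, tReplace, tLoop
sTest = String(2000, "a") & "*" & String(2000, "b")
tReplace = TimeIt("CleanWithReplace", sTest, 20)
tLoop = TimeIt("CleanWithLoop", sTest, 20)
```

    Comparing tReplace and tLoop for a few string lengths should show
    whether the gap grows with the size of the input.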

    Thanks



  7. #6


    mayayana Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    This looks like some kind of advertisement
    for a blog, but it's an interesting question.
    In compiled VB both of the foregoing methods
    would be extremely slow on large strings.
    The webpage sample is allocating a vast
    number of strings to do its job. As the strings
    get bigger it would slow to a crawl. The Replace
    function looks much better to me, but it's also
    fairly slow. (Replace itself is slow.)

    Probably none of that matters if the function
    is only being used for filename strings of 20 or so
    characters. And it's not easy to optimize for
    speed in VBS anyway. But personally I'd still much
    prefer your Replace loop. I don't see the sense of
    writing a highly inefficient Replace method in
    VBS when the scripting runtime can do it internally.

    But in general, why not tokenize? In compiled
    code that should be by far the fastest, with much
    greater speed achieved if the characters can be
    treated as numbers in an array so that the operation
    is not allocating new strings or deciphering the Chr
    value of each stored numeric value of the string.
    In VBS, I don't know whether treating characters as
    numbers will help, since it's still a variant that has
    to be "parsed". I haven't tested the possibilities.
    But I'm using numeric conversion below. I figured that
    it should be a little faster than having the function
    need to do a string comparison. (In a Select Case
    where the character is not an "illegal" there would be
    20-30 string comparisons happening if one uses the
    string version.)

    Another advantage of tokenizing is flexibility.
    There can be dozens of Case declares with very
    little cost.

    ' Note: I just wrote this as an "air code" sample.
    ' I didn't bother to get all of the ascii values since
    ' it's just a demo.

    Function Clean(sIn)
        Dim i2, iChar, A1()

        ' One array element per input character
        ReDim A1(Len(sIn) - 1)
        For i2 = 1 To Len(sIn)
            iChar = Asc(Mid(sIn, i2, 1))
            ' Illegal characters handled here: ? / \ : * < > , . + ~
            Select Case iChar
                Case 63, 47, 92, 58, 42, 60, 62, 44, 46, 43, 126
                    A1(i2 - 1) = "-"
                Case Else
                    A1(i2 - 1) = Chr(iChar)
            End Select
        Next
        Clean = Join(A1, "")
    End Function
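
    To illustrate the flexibility point: because Join simply concatenates
    the array elements, each element can hold a replacement of any length,
    including an empty string. The sketch below is a variation on the
    function above; the name CleanFlexible and the particular replacements
    are assumptions chosen for the example.

```vbscript
' Illustrative variation: each Case group gets its own replacement.
Function CleanFlexible(sIn)
    Dim i, iChar, aOut()

    If Len(sIn) = 0 Then
        CleanFlexible = ""
        Exit Function
    End If

    ReDim aOut(Len(sIn) - 1)
    For i = 1 To Len(sIn)
        iChar = Asc(Mid(sIn, i, 1))
        Select Case iChar
            Case 38                  ' & becomes a whole word
                aOut(i - 1) = " and "
            Case 58                  ' : becomes a dash
                aOut(i - 1) = "-"
            Case 63, 42, 34          ' ? * and the double quote are removed
                aOut(i - 1) = ""
            Case 47, 92, 60, 62      ' / \ < > become a space
                aOut(i - 1) = " "
            Case Else
                aOut(i - 1) = Chr(iChar)
        End Select
    Next
    CleanFlexible = Join(aOut, "")
End Function
```

    For example, CleanFlexible("a&b:c?") returns "a and b-c".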




  8. #7


    James Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    On Mar 5, 1:59*am, "mayayana" <mayay...@newsgroup> wrote:

    > * This looks like some kind of advertisement
    > for a blog, but it's an interesting question.
    > In compiled VB both of the foregoing methods
    > would be extremely slow on large strings.
    > The webpage sample is allocating a vast
    > number of strings to do its job. As the strings
    > get bigger it would slow to a crawl. The Replace
    > function looks much better to me, but it's also
    > fairly slow. (Replace itself is slow.)
    >
    > * *Probably none of that matters if the function
    > is only being used for filename strings of 20+-
    > characters. And it's not easy to optimize for
    > speed in VBS anyway. But personally I'd still much
    > prefer your Replace loop. I don't see the sense of
    > writing a highly inefficient Replace method in
    > VBS when the scripting runtime can do it internally.
    >
    > * *But in general, why not tokenize? In compiled
    > code that should be by far the fastest, with much
    > greater speed achieved if the characters can be
    > treated as numbers in an array so that the operation
    > is not allocating new strings or deciphering the Chr
    > value of each stored numeric value of the string.
    > In VBS, I don't know whether treating characters as
    > numbers will help, since it's still a variant that has
    > to be "parsed". I haven't tested the possibilities.
    > But I'm using numeric conversion below. I figured that
    > it should be a little faster than having the function
    > need to do a string comparison. (In a Select Case
    > where the character is not an "illegal" there would be
    > 20-30 string comparisons happening if one uses the
    > string version.)
    >
    > * *Another adsvantage of tokenizing is flexibility.
    > There can be dozens of Case declares with very
    > little cost.
    >
    > ' Note: I just wrote this as an "air code" sample.
    > ' I didn't bother to get all of the ascii values since
    > ' it's just a demo.
    >
    > Function Clean(sIn)
    > *Dim i2, iChar, A1()
    >
    > *ReDim A1(len(sIn) - 1)
    > * * For i2 = 1 to Len(sIn)
    > * * * *iChar = Asc(Mid(sIn, i2, 1))
    > * * * Select Case iChar
    > * * * * Case 63, 47, 92, 58, 42, 60, 62, 44, 46, 43, 126
    > * * * * * *A1(i2 - 1) = "-"
    > * * * * Case Else
    > * * * * * A1(i2 - 1) = Chr(iChar)
    > * * * End Select
    > * * Next
    > * * * Clean = Join(A1, "")
    > End Function
    Hi Mayayana,

    As the "air code" sample of your method parses the string character by
    character, I suspect that a combination of your method and the
    function provided should allow characters to be replaced while taking
    into account the context of each illegal character.

    I am using the method to clean a plain text string that may or may not
    contain URLs. If there are URLs present in the string, they are later
    replaced with an internal URL with parameters pointing to a logging
    script that logs the request and forwards it to the original URL. The
    cleaned string is also used to generate a set of keywords and
    keyphrases from the text supplied.

    I have based the code below on the "air code" demo, and it has also
    not been tested. I have incorporated the contextual tests so that some
    characters are only removed/replaced if they are not in a specific
    context (using a URL as an example).

    The method below must certainly be a better approach than the function
    linked from this thread, or the one suggested by Al. What do you think?
    Also, is there a better way to incorporate the contextual tests for
    each illegal character in the string?

    Thanks

    James

    -------------------------

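    Function Clean(sIn)
        Dim i2, iChar, A1(), rChar, lChar

        ReDim A1(Len(sIn) - 1)
        For i2 = 1 To Len(sIn)
            iChar = Asc(Mid(sIn, i2, 1))
            Select Case iChar
                Case 58  ' ":" is kept only when followed by "//"
                    If Mid(sIn, i2 + 1, 2) = "//" Then
                        A1(i2 - 1) = Chr(iChar)
                    Else
                        A1(i2 - 1) = ""
                    End If

                Case 47  ' "/" is kept only next to another "/"
                    ' Compare as strings: Asc("") fails when "/" is the last
                    ' character, and Mid with a start of 0 fails when it is
                    ' the first.
                    rChar = Mid(sIn, i2 + 1, 1)
                    lChar = ""
                    If i2 > 1 Then lChar = Mid(sIn, i2 - 1, 1)

                    If rChar = "/" Or lChar = "/" Then
                        A1(i2 - 1) = Chr(iChar)
                    Else
                        A1(i2 - 1) = "-"
                    End If

                Case 63, 92, 42, 60, 62  ' ? \ * < >
                    A1(i2 - 1) = "-"

                Case 44, 46, 43, 126  ' , . + ~
                    A1(i2 - 1) = ""

                Case Else
                    A1(i2 - 1) = Chr(iChar)
            End Select
        Next
        Clean = Join(A1, "")
    End Function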


  9. #8


    mayayana Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    > The method below must certainly be a better approach than the function
    > linked from this thread, or the one suggested by Al. What do you think?
    > Also, is there a better way to incorporate the contextual tests for
    > each illegal character in the string?

    I think that's pretty much what I meant in saying
    it's flexible. There's no limit, really. One could even
    call separate functions from within the Select Case.

    Parsing URLs sounds tricky, but it can be done. For instance, you
    could check each ":" to see if it's part of "http://",
    then get the whole URL and write your edited
    URL to the array. You'd just have to find the end
    of the URL, calculate the offset of the start and end
    characters, and keep track of how many characters
    you've actually written to the array. With edits involved
    you might need to use a bigger array and then ReDim
    Preserve it at the end before the Join call.
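
    A rough sketch of that idea follows. It is a simplified variation, not
    tested against real data: it looks for "http://" at the current
    position rather than backing up from the ":", it assumes a URL ends at
    the next space, and the "edit" is just wrapping the URL in square
    brackets. MarkUrls is a made-up name.

```vbscript
' Sketch only: copies the input piece by piece, writing each URL as a
' single array element wrapped in square brackets.
' Assumptions: URLs start with "http://" and end at the next space.
Function MarkUrls(sIn)
    Dim i, n, iEnd, aOut()

    If Len(sIn) = 0 Then
        MarkUrls = ""
        Exit Function
    End If

    ReDim aOut(Len(sIn) - 1)   ' upper bound: one element per character
    n = 0                      ' number of elements written so far
    i = 1
    Do While i <= Len(sIn)
        If LCase(Mid(sIn, i, 7)) = "http://" Then
            ' Find the end of the URL
            iEnd = InStr(i, sIn, " ")
            If iEnd = 0 Then iEnd = Len(sIn) + 1
            aOut(n) = "[" & Mid(sIn, i, iEnd - i) & "]"
            i = iEnd           ' continue at the space (or past the end)
        Else
            aOut(n) = Mid(sIn, i, 1)
            i = i + 1
        End If
        n = n + 1
    Loop

    ReDim Preserve aOut(n - 1) ' drop the unused elements before Join
    MarkUrls = Join(aOut, "")
End Function
```

    Here a URL takes up one array element instead of many, so the array
    only ever needs trimming; a larger array would be needed if an edit
    could produce more elements than there are input characters.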


  10. #9


    James Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    On Mar 5, 1:55 pm, "mayayana" <mayay...@newsgroup> wrote:

    > I think that's pretty much what I meant in saying
    > it's flexible. There's no limit, really. One could even
    > call separate functions from within the Select Case.
    >
    > Parsing URLs sounds tricky, but it can be done. For instance, you
    > could check each ":" to see if it's part of "http://",
    > then get the whole URL and write your edited
    > URL to the array. You'd just have to find the end
    > of the URL, calculate the offset of the start and end
    > characters, and keep track of how many characters
    > you've actually written to the array. With edits involved
    > you might need to use a bigger array and then Redim
    > Preserve it at the end before the Join call.

    Thanks Mayayana,

    The illegal characters are being removed or replaced as expected. I
    am using a regular expression with the replace function to remove all
    HTML tags except for "a" tags (hyperlinks). I am then removing all "a"
    tags so that only the href value is left, which is placed after the
    anchor text in brackets.
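
    For readers following along, the two regular-expression steps might
    look something like the sketch below, using the VBScript RegExp
    object. The patterns and the name StripTagsKeepLinks are assumptions
    for illustration, not the ones used in the actual script, and
    "brackets" is taken here to mean parentheses. Regular expressions are
    also not a full HTML parser, so unusual markup may slip through.

```vbscript
' Sketch of the two steps: strip all tags except anchors, then
' rewrite each anchor as "text (URL)".
Function StripTagsKeepLinks(sHtml)
    Dim re, sOut
    Set re = New RegExp
    re.Global = True
    re.IgnoreCase = True

    ' 1. Remove every tag except <a ...> and </a>
    re.Pattern = "<(?!/?a[\s>])[^>]*>"
    sOut = re.Replace(sHtml, "")

    ' 2. Turn <a href="URL">text</a> into: text (URL)
    re.Pattern = "<a\s[^>]*href=""([^""]*)""[^>]*>([\s\S]*?)</a>"
    sOut = re.Replace(sOut, "$2 ($1)")

    StripTagsKeepLinks = sOut
End Function
```

    The second pattern assumes the href value is wrapped in double quotes.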

    In the next step, I use the string clean function from the linked
    article (now modified to include suggestions in this thread) to remove
    all special characters from the string except when they are a
    component of a URL.

    The final step, which I am currently working on, is to parse the
    cleaned string to replace URLs with the internal redirect. It is
    working as expected, but there are some cases where URLs are not
    followed by a space, depending on the context in the original string.
    The problem is that there isn't currently a consistent method to
    find the end of each URL. I am working toward adjusting the function
    so that all URLs are contained in square brackets [] once processed
    by the string clean function, so that they can be found easily when
    parsing to update the URLs.

    I am replacing all special characters with a space, then re-parsing
    the string to remove double (or more) spaces between words / URLs.
    This works most of the time, but as I am not removing "." chars (ASCII
    # 46), a URL may end up with an additional "." at the end
    (http://address.com.). To prevent this, I am replacing all "." with " ."
    before parsing URLs to allow URLs to be recognised consistently.
    After parsing and converting URLs, I then replace any occurrences of
    " ." with the original ".".

    This seems to work, but I am not sure that it is the best way to do
    this as the same string is parsed a number of times before the desired
    results are achieved.
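
    As a point of comparison, the space clean-up and the " ." step could
    be written as two small helpers like the ones below. This is a sketch
    with made-up names, and DetachTrailingDots is a narrower variation on
    the step described above: it only detaches a "." that is followed by a
    space or ends the string, so that dots inside a URL are left alone.

```vbscript
' Collapses runs of two or more spaces into a single space.
Function CollapseSpaces(sIn)
    Dim sOut
    sOut = sIn
    Do While InStr(sOut, "  ") > 0
        sOut = Replace(sOut, "  ", " ")
    Loop
    CollapseSpaces = sOut
End Function

' Puts a space in front of each "." that is followed by a space or
' that ends the string.
Function DetachTrailingDots(sIn)
    Dim sOut
    sOut = Replace(sIn, ". ", " . ")
    If Right(sOut, 1) = "." Then
        sOut = Left(sOut, Len(sOut) - 1) & " ."
    End If
    DetachTrailingDots = sOut
End Function
```

    For example, DetachTrailingDots("See http://address.com. Bye.")
    returns "See http://address.com . Bye .", which leaves the URL
    followed by a space.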

    The string clean function works well using the tokenizing method.
    Thanks again for your suggestion.

    James


  11. #10


    mayayana Guest

    Re: VBScript String Replace - Remove / Replace Characters in String

    > This seems to work, but I am not sure that it is the best way to do
    > this as the same string is parsed a number of times before the desired
    > results are achieved.

    I think if it were me I'd put it *all* in the tokenizer.
    For instance, for "<" you could do something like:

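    Case 60
        If UCase(Mid(sIn, i2 + 1, 1)) = "A" Then
            ' This is an anchor tag, so parse it.
        Else
            ' Drop out all other tags: skip ahead to the closing ">",
            ' stopping at the end of the string if there is none.
            Do While i2 < Len(sIn)
                i2 = i2 + 1
                If Mid(sIn, i2, 1) = ">" Then Exit Do
            Loop
        End If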

    One note with that: You'd want to use Do/Loop
    for the main loop so that you can change the
    value of i2. The code above would go back to the
    start of the main loop and begin processing the next
    character after the end of the tag. My original code
    used: For i2 = ..... Next
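
    Put together, a Do/Loop version might look like the sketch below. It
    is untested "air code" in the same spirit: DropTags is a made-up name,
    anchor tags are simply copied through unchanged, and treating "</a"
    as part of an anchor is an added assumption.

```vbscript
' Sketch of a tokenizer whose main loop is a Do/Loop, so that a Case
' branch can move the position i2 forward past a whole tag.
Function DropTags(sIn)
    Dim i2, sChar, n, A1()

    If Len(sIn) = 0 Then
        DropTags = ""
        Exit Function
    End If

    ReDim A1(Len(sIn) - 1)
    n = 0                      ' number of elements written so far
    i2 = 1
    Do While i2 <= Len(sIn)
        sChar = Mid(sIn, i2, 1)
        Select Case sChar
            Case "<"
                If UCase(Mid(sIn, i2 + 1, 1)) = "A" Or _
                   UCase(Mid(sIn, i2 + 1, 2)) = "/A" Then
                    ' Anchor tag: copied through unchanged in this sketch
                    A1(n) = sChar
                    n = n + 1
                Else
                    ' Skip ahead to the closing ">" (or the end of the string)
                    Do While i2 < Len(sIn)
                        i2 = i2 + 1
                        If Mid(sIn, i2, 1) = ">" Then Exit Do
                    Loop
                End If
            Case Else
                A1(n) = sChar
                n = n + 1
        End Select
        i2 = i2 + 1            ' move on to the next unprocessed character
    Loop

    If n = 0 Then
        DropTags = ""
    Else
        ReDim Preserve A1(n - 1)
        DropTags = Join(A1, "")
    End If
End Function
```

    Note that the one-letter test also lets through other tags whose name
    starts with "a", so a stricter check would be needed for real HTML.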

    I guess it all gets down to a matter of personal
    preference at some point, though. You're the one
    who's going to have to maintain your script.







