Special Character Formatting - CodingForum



Special Character Formatting


  • Special Character Formatting

    Does anyone know how to strip Word or other word processing formatting from an HTML text area? I can use the Replace function for each ASCII character but there has to be an easier way.

  • #2
    Whenever I copy text from an HTML page or similar, I usually paste it into Notepad first and then copy it again from Notepad to the clipboard.

    This strips unwanted formatting for me.

    Hope this helps.
    House Of Proctor Genealogy


    • #3
      I actually do the same thing JoeP does... seems the quickest way to me without stripping stuff you don't want.

      Unless you need to replace the stuff when retrieving it from a file dynamically - in which case you will want to use the Replace() function. If the latter is the case, I agree with Dave - have any examples?

      Here's the basic idea though:

      myString = Replace(Replace(Replace(Replace(myString,"<",""),">",""),chr(34),""),chr(39),"")
      That replaces <, >, " and ' with nothing, i.e. removes them from myString.
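The nested calls get hard to read as the list of characters grows. Here is a minimal sketch of the same idea written as a loop. The helper name StripChars is hypothetical (not from this thread), and it only removes the individual characters listed, not whole tags:

```vbscript
' Hypothetical helper: removes every character listed in charsToStrip from text.
Function StripChars(text, charsToStrip)
    Dim i, result
    result = text
    For i = 1 To Len(charsToStrip)
        ' Replace() with an empty string deletes all occurrences of that character
        result = Replace(result, Mid(charsToStrip, i, 1), "")
    Next
    StripChars = result
End Function

' Same four characters as the one-liner: < > " '
Dim cleaned
cleaned = StripChars("<b>It's ""fine""</b>", "<>" & Chr(34) & Chr(39))
' cleaned is now: bIts fine/b
```

Note that the tag names survive; only the listed characters are dropped.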

      Last edited by whammy; Jun 17, 2002, 07:24 PM.
      Former ASP Forum Moderator - I'm back!

      If you can teach yourself how to learn, you can learn anything. ;)


      • #4
        Special Characters

        Trying to strip Word formatting from a cut and paste before it gets to the database. The formatting could be tabs, symbols, international alphabet characters, etc. Anything that could be cut and pasted into a text area from a word processor.


        • #5
          Instead of trying to identify all the unwanted characters, as in the Replace() example above, identify the ones you do want. It's much easier to define what you want than to define every other possible character that you don't want.
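One way to sketch this whitelist idea is a regular expression with a negated character class, so everything outside the allowed set is removed in a single pass. The helper name KeepAllowed and the particular set of allowed characters are assumptions for illustration; adjust the set to whatever your database should accept:

```vbscript
' Hypothetical helper: keeps only whitelisted characters and drops everything else.
Function KeepAllowed(text)
    Dim re
    Set re = New RegExp
    ' Negated class: matches anything that is NOT a letter, digit, space,
    ' or one of the listed punctuation marks (the set itself is an assumption)
    re.Pattern = "[^A-Za-z0-9 .,!?'-]"
    re.Global = True   ' replace every match, not just the first
    KeepAllowed = re.Replace(text, "")
End Function

Dim pasted
pasted = "Caf" & ChrW(233) & " " & ChrW(8220) & "menu" & ChrW(8221) & vbTab & "today"
' The accented letter, both curly quotes and the tab are dropped:
' KeepAllowed(pasted) returns "Caf menutoday"
```

The trade-off is visible in the result: anything not listed is lost outright, including international letters, so the whitelist has to cover everything you want to keep.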


          • #6
            Yeah... maybe instead of using Replace(), you could also just use a regular expression that contains the characters that are acceptable to you, and match the whole string against that, like:

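            Dim myRegExp, myStringError
            ' Object references need Set in VBScript
            Set myRegExp = New RegExp

            With myRegExp
            ' Anchored with ^ and $ so the whole string must be word characters and whitespace
            .Pattern = "^[\w\s]*$"
            .IgnoreCase = True
            .Global = True
            End With

            If Not myRegExp.Test(MyString) Then
            myStringError = True
            End If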

            I haven't tested that...

            Or, using another method (not NEARLY as elegant), you could make a string of characters that are acceptable, like:

            myAcceptableCharacters = ".|,|A|B|C|D|"


            And loop through the string you're checking to see if the current character is in the string (say using a variable like CurrentCharacter), like:

            If InStr(myAcceptableCharacters, CurrentCharacter) = 0 Then MyError = True
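For completeness, here is a rough sketch of the full loop around that InStr check. The function name HasOnlyAllowed is hypothetical. Two assumptions to note: the acceptable characters are listed without "|" separators, because InStr would otherwise treat "|" itself as acceptable, and the comparison is case sensitive:

```vbscript
' Hypothetical helper: True when every character of text appears in allowedChars.
Function HasOnlyAllowed(text, allowedChars)
    Dim i
    HasOnlyAllowed = True
    For i = 1 To Len(text)
        ' InStr returns 0 when the character is not found (binary = case sensitive)
        If InStr(1, allowedChars, Mid(text, i, 1), vbBinaryCompare) = 0 Then
            HasOnlyAllowed = False
            Exit Function
        End If
    Next
End Function

Dim myAcceptableCharacters, MyError
myAcceptableCharacters = ".,ABCD"
MyError = Not HasOnlyAllowed("A,B.C", myAcceptableCharacters)   ' False: every character is allowed
```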
            Last edited by whammy; Jun 21, 2002, 07:32 PM.


            • #7
              Heh... that's definitely typical "Word" HTML formatting. YECH.

              HTML TIDY (or the plugin HTML TIDY that comes with HTML KIT) claims to strip all of the "Word" formatting from a WORD-->HTML page, but from what I've seen it strips almost everything, lol.

              I'm not sure how to overcome the obstacle of someone potentially pasting "Word" characters in a textarea, without using a regular expression or function of some sort.

              If you're not comfortable using regular expressions, you might be better off letting users know they need to paste from Notepad first.
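If a function is acceptable, one option is to translate the most common "Word" punctuation into plain equivalents before saving, rather than dropping it. This is only a sketch: the helper name PlainPunctuation is hypothetical, the list of characters is far from complete, and it assumes the submitted text has been decoded correctly so the curly quotes arrive as their Unicode code points:

```vbscript
' Hypothetical helper: swaps common Word "smart" punctuation for plain ASCII.
Function PlainPunctuation(text)
    Dim result
    result = text
    result = Replace(result, ChrW(8216), "'")      ' left single quote
    result = Replace(result, ChrW(8217), "'")      ' right single quote / apostrophe
    result = Replace(result, ChrW(8220), Chr(34))  ' left double quote
    result = Replace(result, ChrW(8221), Chr(34))  ' right double quote
    result = Replace(result, ChrW(8211), "-")      ' en dash
    result = Replace(result, ChrW(8212), "-")      ' em dash
    result = Replace(result, ChrW(8230), "...")    ' ellipsis
    result = Replace(result, vbTab, " ")           ' tab
    PlainPunctuation = result
End Function
```

Running this first and a whitelist check afterwards keeps readable punctuation while still rejecting anything unexpected.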