regex and hyphenated words - CodingForum


regex and hyphenated words

  • regex and hyphenated words

    Hi guys,

    Been working on a bit of regex to locate exact word matches, but I'm having some problems due to the lack of lookbehind support. I was wondering if anyone could help me nail the final issue in this expression:

    sTest = new RegExp(/\bfoo(?![-])\b/);
    Against the following array it produces these matches (marked with [B]...[/B]):

    {'[B]foo[/B]', 'bar [B]foo[/B]', '[B]foo[/B] bar', 'foo-bar', 'bar-[B]foo[/B]', 'foo_bar', 'bar_foo'}
    All are correct except the one with a hyphen before the "foo" - I don't want to be getting a match on "bar-foo" but everything I try seems to break it.

    Assistance much appreciated.
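
The behaviour described in the post above can be reproduced with a short sketch. It uses the pattern and sample strings from the post; the names `samples` and `matched` are only illustrative, and `console.log` is assumed to be available.

```javascript
// The pattern and sample strings from the post above.
var sTest = /\bfoo(?![-])\b/;
var samples = ["foo", "bar foo", "foo bar", "foo-bar", "bar-foo", "foo_bar", "bar_foo"];

// Collect every sample the pattern matches.
var matched = [];
for (var i = 0; i < samples.length; i++) {
  if (sTest.test(samples[i])) {
    matched.push(samples[i]);
  }
}

console.log(matched); // ["foo", "bar foo", "foo bar", "bar-foo"]
```

"bar-foo" still matches because `\b` treats the hyphen as a word boundary and the lookahead only inspects what follows "foo", not what precedes it.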

  • #2
    Perhaps I should add that I only want to check for hyphens that are treated as word breaks before my search phrase, not within it.

    i.e. if my search phrase was "foo-bar" I would want to match "foo-bar" but not match "bar-foo-bar".

    Hope that makes sense.


    • #3
      This may move you forward.

      var a = "foo-bar";
      var b = "bar-foo-bar";
      // Prefix a space so the phrase must follow whitespace, not a hyphen.
      a = " " + a;
      if (/\s(foo)\b/.test(a)) {
        alert("Match");
      } else {
        alert("No Match");
      }
      b = " " + b;
      if (/\s(foo)\b/.test(b)) {
        alert("Match");
      } else {
        alert("No Match");
      }

      He thought he saw a Coach-and-Four
      That stood beside his bed:
      He looked again, and found it was
      A Bear without a Head.
      "Poor thing," he said, "poor silly thing!
      It's waiting to be fed!"
      - Lewis Carroll

      All the code given in this post has been tested and is intended to address the question asked.
      Unless stated otherwise it is not just a demonstration.


      • #4
        Considering JS regexp doesn't support lookbehind, you can mimic that functionality by reversing both strings, i.e. both the string you're going to be looking through and the regexp string that needs to be matched:

        This would match rab-oof (foo-bar), but not rab-oof-rab (bar-foo-bar). For more information on achieving the reverse result and essentially mimicking lookbehinds in JS, have a look at this website; they have listed other possible ways of doing it, and their drawbacks:
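
The original snippet for this reply did not survive, so the following is only a rough sketch of the reversal idea, not the poster's code. The helper names `reverse` and `matchesWithoutLeadingHyphen` are hypothetical, and the sketch assumes the search phrase contains no regex metacharacters and that the text is plain ASCII.

```javascript
// Hypothetical helper: reverse a string (adequate for plain ASCII text).
function reverse(s) {
  return s.split("").reverse().join("");
}

// Hypothetical helper: true if `word` occurs in `text` as a whole word
// that is not immediately preceded by a hyphen.
function matchesWithoutLeadingHyphen(text, word) {
  // In the reversed text, "preceded by a hyphen" becomes "followed by a
  // hyphen", which a negative lookahead can express.
  var pattern = new RegExp("\\b" + reverse(word) + "(?!-)\\b");
  return pattern.test(reverse(text));
}

console.log(matchesWithoutLeadingHyphen("foo-bar", "foo-bar"));     // true
console.log(matchesWithoutLeadingHyphen("bar-foo-bar", "foo-bar")); // false
```

Note that this only guards against a hyphen before the phrase; a hyphen after it would still need the forward-looking check from the original expression.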


        Hope this helps.
        The way to success is to assume that there are no impossible things. After all, if you think something is impossible, you will not even try to do it.

        How to ask smart questions?


        • #5
          Thanks guys, the reverse thing was interesting and I can see some good uses for it in other cases but it just felt like way too much of an overhead to check for one possible character combination. For my desired scenario it would have meant a reverse, another pass through a loop and a bit of array shuffling.

          Instead I ended up going for an if statement with a check for a match on ("-" + "foo"). It doesn't feel elegant, but I think it's a less convoluted method in this particular scenario.

          Anyway, for those interested, here's the full working code, which I think makes a pretty robust get-element-by-classname function.

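          function fGetElements(sParent, sTagType, sTagClass) {
          	var oElements = sParent.getElementsByTagName(sTagType);
          	var nElementLength = oElements.length;
          	this.aMatchingElement = new Array();
          	this.nMatchCount = 0;
          	// Build the regex literal as a string, then evaluate it.
          	var sExpression = "/\\b" + sTagClass + "(?![-])\\b/";
          	sExpression = eval(sExpression);
          	for (var nCount = 0; nCount < nElementLength; nCount++) {
          		if (oElements[nCount].className.match(sExpression)) {
          			// Skip elements whose className contains the class preceded by a hyphen.
          			if (!oElements[nCount].className.match("-" + sTagClass)) {
          				this.aMatchingElement[this.nMatchCount] = oElements[nCount];
          				// The rest of the function is cut off in the source; the lines below are an assumed completion.
          				this.nMatchCount++;
          			}
          		}
          	}
          }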
          Any comments/suggestions/criticisms welcome


          • #6
            Element classNames are space-separated tokens -- you have a well defined boundary condition, and don't need a RegExp at all:
            // death to Hungarian Notation! :)
            tagClass = " "+tagClass+" ";
            if ( (" "+elements[count].className+" ").indexOf(tagClass) != -1 ){
                // elements[count] carries the class
            }
            For when you do need a RegExp, read up on the syntax of
            new RegExp(expression, flags)
            eval is almost never the right answer.
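
As a minimal sketch of the padding idea above, the check can be wrapped in a small function. The name `hasClass` is hypothetical, and the sketch assumes class names are separated by single spaces rather than tabs or newlines.

```javascript
// Hypothetical helper: true if the space-separated list `className`
// contains `token` as a complete class name.
function hasClass(className, token) {
  // Pad both sides so that every token, including the first and the
  // last, is surrounded by spaces before searching.
  return (" " + className + " ").indexOf(" " + token + " ") !== -1;
}

console.log(hasClass("foo", "foo"));         // true
console.log(hasClass("bar foo bar", "foo")); // true
console.log(hasClass("bar-foo", "foo"));     // false
console.log(hasClass("foo_bar", "foo"));     // false
```

Because both the haystack and the needle are padded, a lone "foo" becomes " foo " on both sides and matches, while "bar-foo" becomes " bar-foo " and does not contain " foo ".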


            • #7
              Interesting, it hadn't occurred to me that (" " + sTagClass + " ") would match both "foo" and, say, "bar foo bar"; I would have expected it to match only the second. So thanks for that.

              So that my adventure into RegExp isn't completely wasted, can you give me some help with the syntax for creating a RegExp with a variable inside? From all the examples I've found:

              var sExpression = new RegExp("/\\b" + sTagClass + "(?![-])\\b/");
              Should work, but doesn't. Am I escaping wrong or something?


              • #8
                You were very close, but you put forward slashes in the constructor. Here you go:

                <script type="text/javascript">
                var value = "Hello";
                var testString = "Does this string contain Hello or not?";
                // No surrounding slashes: the constructor takes the bare pattern plus flags.
                var sExpression = new RegExp('\\b' + value + '\\b', 'i').test(testString);
                alert(sExpression);
                </script>
                Last edited by Philip M; Apr 14, 2009, 06:43 AM.
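
Applied to the expression from earlier in the thread, the same fix might look like the sketch below. The helpers `escapeRegExp` and `buildClassPattern` are hypothetical names introduced here; escaping the variable is an extra precaution that the thread itself does not discuss.

```javascript
// Hypothetical helper: escape regex metacharacters in the variable so
// that it is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the thread's pattern without the
// surrounding slashes that a regex literal would need.
function buildClassPattern(sTagClass) {
  return new RegExp("\\b" + escapeRegExp(sTagClass) + "(?![-])\\b");
}

var good = buildClassPattern("foo");
console.log(good.test("bar foo")); // true
console.log(good.test("foo-bar")); // false

// With slashes inside the string, the pattern looks for literal "/"
// characters in the text, so it does not match a plain "foo".
var bad = new RegExp("/\\bfoo(?![-])\\b/");
console.log(bad.test("foo")); // false
```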
