Web Analytics Made Easy -
StatCounter SQL Full Text search - CodingForum

Announcement

Collapse
No announcement yet.

SQL Full Text search

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • SQL Full Text search

    Hello. I have this table:

    Code:
    name        |   tags
    ------------+-----------------------
    960         |   css framework
    Blueprint   |   css framework
    jquery      |   javascript framework
    (P.S: The table has 3 more rows with the word framework on the tags field. I didn't insert them because i was too lazy to sort the table above )

    and execute this query:

    Code:
    SELECT name,tags,(MATCH (name,tags) AGAINST ('*css*' IN BOOLEAN MODE) + MATCH (name,tags) AGAINST ('*framework*' IN BOOLEAN MODE)) AS score FROM weds_scripts WHERE MATCH (name,tags) AGAINST ('*css* *framework*' IN BOOLEAN MODE) ORDER BY score
    The full text will ignore the css keyword and search only the framework keyword (css is NOT in Stopwords and NO its not too short because i tried to search the keyword j and it worked) so it gave me those results:

    Code:
    name        |   tags                 |  score
    ------------+------------------------+-------
    960         |   css framework        |  1
    Blueprint   |   css framework        |  1
    jquery      |   javascript framework |  1
    Does anyone know why it ignores the keyword: css?
    Thx

  • #2
    It will ignore words below four characters in length unless you have specifically changed your my.ini/my.cnf settings to allow otherwise.

    The other reason words would be ignored is if they would be found in 50% or more of the rows returned.

    Comment


    • #3
      1)
      NO its not too short because i tried to search the keyword j and it worked
      2)
      P.S: The table has 3 more rows with the word framework on the tags field
      So every row has the word framework on the tags field (if u check the table and the p.s) that means it is found on 100% of the rows returned... But it WON'T be ignored (unlike the css keyword which is present on LESS than 50% of the rows returned)!

      I already know those things...

      Comment


      • #4
        I double checked and the 50% matching rows criteria is not used in BOOLEAN mode, but the short words criteria is. Do you know if that value has been changed in the config file?
        Last edited by guelphdad; Sep 4, 2011, 03:47 AM.

        Comment


        • #5
          Code:
          CREATE TABLE scripts(
              id int unsigned not null primary key auto_increment,
              name text,
              tags text
          )
          ENGINE = MYISAM
          CHARSET utf8;
          
          ALTER TABLE scripts ADD FULLTEXT(name, tags);
          provide actual working example that a FULLTEXT index is applied and is finding terms one character in length and with words in more than 50% of the search terms.
          Code:
          SELECT name,tags, (MATCH(name,tags) AGAINST ('*j*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*j*' IN BOOLEAN MODE)
          j = 1 character long -> It finds terms (score = 1)

          Also if u use the table data i gave u above and instead of j u use framework (which is in EVERY row) it will still give u results
          Code:
          SELECT name,tags, (MATCH(name,tags) AGAINST ('*framework*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*framework*' IN BOOLEAN MODE)
          But if u use css as keyword it won't return anything!!!
          Code:
          SELECT name,tags, (MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)

          Comment


          • #6
            About the short word criteria:
            Does it apply on the keyword's length (f.e: c)
            or
            the term's length found (f.e css)?

            Comment


            • #7
              Found the problem
              The min length is applied on the TERM FOUND not the keyword (in this case css which is less than 4 characters).... I changed a record tag from css framework to csss framework and then executed

              Code:
              SELECT name,tags, (MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)
              and it found it

              Comment


              • #8
                In a BOOLEAN search * is a special character (a wild card if you will).

                using a BOOLEAN search with 'j' nothing will be found. Using it with 'j*' means any word beginning with 'j'. using it with '*j*' as you have means any word containing 'j'.

                That is why it will return rows for you. Because 'javascript' and 'jquery' both contain j followed by any number of characters (which is what the * is matching).

                however searching on '*css*' means any word containing css in it. So 'fixcss' or 'cssjbrd' would be returned because 'css' is contained in those made up words, but since css is a whole word in your case, and that whole word is less than four characters it won't be returned since it is smaller than the four characters required by the fulltext search.

                Comment

                Working...
                X