0

i'm trying to get the innerHTML out of two different ids to count words. Therefore i used querySelectorAll with two id matches. I only get the first match back. Is this method even possible?

function() {
    var wordCounts;
    var wordCountTemp = document.querySelectorAll("#text-block-10, #text-block-12");
    var i;
    for(i = 0; i < wordCountTemp.length; i++){
    wordCounts = wordCountTemp[i].innerHTML;
    wordCounts.replace(/(^\s*)|(\s*$)/gi,"");
    wordCounts.replace(/[ ]{2,}/gi," ");
    wordCounts.replace(/\n /,"\n");
    return wordCounts.split(" ").length;
    }
}

Thanks a lot for your help!

Best regards, Toni

Toni2708
  • 119
  • 1
  • 3
  • 10
  • This `wordCounts = wordCountTemp[i].innerHTML;` completely overwrites `wordCounts`. You would need to initialize to an empty string `var wordCounts = ""` and then use `+=` instead of `=`. Also, `.replace()` doesn't modify the string. It returns a new string with the replacement done, so you need to assign that to keep the result. Finally, having `return` inside the loop will guarantee that you *always* return on the first iteration. It should be after the loop. –  Aug 20 '16 at 01:06
  • What did you find when you stepped through your code using the debugger? –  Aug 20 '16 at 04:30
  • By the way, why are you trying to do this? –  Aug 20 '16 at 04:35
  • Hi @torazaburo I write the code as a javascript variable in Google Tag Manager thus I can test the code only in the preview and debug mode. This can be very annoying when testing every single step. Beside that I work with querySelectorAll the first time and didn't know if it is possible to search for more than one match. I'm trying figure which word length works best in blog posts in combination with google analytics content grouping. – Toni2708 Aug 20 '16 at 18:42

2 Answers2

2

You return from your function prior to doing anything with any element other than the first one returned from querySelectorAll. In addition, replace does not modify the string, it returns a new copy. Thus, the count you are returning is that of wordCountTemp[i].innerHTML.split(" ").length.

Your original code: (with comments)

function() {
    var wordCounts;
    var wordCountTemp = document.querySelectorAll("#text-block-10, #text-block-12");
    var i;
    for(i = 0; i < wordCountTemp.length; i++){
        wordCounts = wordCountTemp[i].innerHTML;
        wordCounts.replace(/(^\s*)|(\s*$)/gi,"");  //Has no effect
        wordCounts.replace(/[ ]{2,}/gi," ");       //Has no effect
        wordCounts.replace(/\n /,"\n");            //Has no effect

        //This next line returns your result in the first pass through the loop.
        //  Only the first element returned by querySelectorAll is acted upon.
        //  No other element is processed other than the first one.
        return wordCounts.split(" ").length; 
    }
}

Note: I am changing innerHTML to textContent. I'm assuming that you only want to count the words which are text (i.e. not HTML code, scripts, etc.). I also changed the variable name wordCountTemp to nodeList as that is more descriptive of what it is (it is, in fact, a NodeList)

To use a similar structure to what you are already using:

function countWords() {
    var wordCounts;
    var totalWordCount=0;
    var nodeList = document.querySelectorAll("#text-block-10, #text-block-12");
    for(var i = 0; i < nodeList.length; i++){
        wordCounts = nodeList[i].textContent;
        wordCounts = wordCounts.replace(/(^\s*)|(\s*$)/gi,"");
        wordCounts = wordCounts.replace(/[ ]{2,}/gi," ");
        wordCounts = wordCounts.replace(/\n /,"\n");
        totalWordCount += wordCounts.split(" ").length;
    }
    return totalWordCount; //return the total count after all passes through loop
}

Instead of assigning your result of each replace to wordCounts over and over again to progressively modify it, you could just directly act on the new string returned by replace:

function countWords() {
    var totalWordCount=0;
    var nodeList = document.querySelectorAll("#text-block-10, #text-block-12");
    for(var i = 0; i < nodeList.length; i++){
        totalWordCount += nodeList[i].textContent
                                     .replace(/(^\s*)|(\s*$)/gi,"")
                                     .replace(/[ ]{2,}/gi," ")
                                     .replace(/\n /,"\n")
                                     .split(" ").length;
    }
    return totalWordCount; //return the total count after all passes through loop
}

Using regular expressions is, relatively, expensive. not really that much, but there is no reason not to optimize in this situation. Instead of performing all of the replace functions for each element, it is more efficient to, within the loop, concatenate the strings returned by textContent. Then, after you have one big long string with all the text from all the elements, you can perform the replace and split actions once.

function countWords() {
    var allText='';
    var nodeList = document.querySelectorAll("#text-block-10, #text-block-12");
    for(var i = 0; i < nodeList.length; i++){
        allText += ' ' + nodeList[i].textContent;
    }
    //return the total count after getting the textContent from all elements
    return allText.replace(/(^\s*)|(\s*$)/gi,"")
                  .replace(/[ ]{2,}/gi," ")
                  .replace(/\n /,"\n")
                  .split(" ").length;
}

Note: All of the above assume that none of the elements returned by the querySelectorAll are children of other elements which are returned. If they are, you will be counting the same text twice.

Makyen
  • 31,849
  • 12
  • 86
  • 121
  • Hi @Makyen, i didn't expect such a deep explanation. Now I really understand the single steps of the code. With the new knowledge I'm trying to fix my issue. Thanks! – Toni2708 Aug 20 '16 at 18:52
  • Hi @Makyen, your code works perfectly! Do you know how I can tell the querySelector which pattern to choose? Like a regExp or rather Wildcard text-block-* ? Guess I have to loop the whole document an compare it with the ids in the array? Would really appreciate your help. Cheers Toni – Toni2708 Aug 20 '16 at 22:20
  • @Toni2708, I'm not sure what you mean by "which pattern to choose". The argument for [`querySelectorAll()`](https://developer.mozilla.org/en-US/docs/Web/API/Element/querySelectorAll) is a CSS selector ([MDN](https://developer.mozilla.org/en-US/docs/Web/Guide/CSS/Getting_Started/Selectors), [w3schools](http://www.w3schools.com/cssref/css_selectors.asp)). What you use depends on what you want to match. I don't know *exactly* what you want to mach, so I have a hard time suggesting something. However, `[id^=text-block]` would select all elements with an ID that starts with `text-block`. – Makyen Aug 20 '16 at 23:07
  • @Toni2708, If you are going to manually go through the entire document for text (not really suggested, unless that is really what you need), you might want to consider using a [TreeWalker](https://developer.mozilla.org/en-US/docs/Web/API/TreeWalker) to get all the text nodes in the ``. I have an example that uses a TreeWalker to modify all the text in all text nodes in the `` in [another answer](http://stackoverflow.com/a/37566073/3773011). – Makyen Aug 20 '16 at 23:42
  • This is exactly what I was searching for! Everything just works fine now. Thanks a lot @Makyen – Toni2708 Aug 20 '16 at 23:51
0

Try this:

Array.from(document.querySelectorAll("#text-block-10, #text-block-12"))
  .map(x => x.innerHTML
    .replace(/(^\s*)|(\s*$)/gi,"")
    .replace(/[ ]{2,}/gi," ")
    .replace(/\n /,"\n")
    .split(" ")
    .length)
  .reduce((a, b) => a + b)
Michał Perłakowski
  • 88,409
  • 26
  • 156
  • 177
  • And this fixes what problem in the OP's code, and works how? –  Aug 20 '16 at 04:29
  • @Toni2708, This is certainly more compact (elegant?). You should take a look at it and try to figure it out. However, it has significant issues with compatibility across all browsers. Compatibility/feature support is a general issue for writing JavaScript, particularly when using newer features of the language. You need to check the compatibility of each feature vs. the environment (browser and version) in which you expect it to run. Compatibility across browsers/versions may not matter to you, depending on your expected use of the code you are writing. (continued) – Makyen Aug 20 '16 at 19:50
  • (continued) @Toni2708, See the "Browser compatibility" section at the bottom of each of the following pages: [`Array.from`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/from), [`.map [Array]`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map), [`.reduce`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Reduce), and [`=>` (Arrow functions)](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions). – Makyen Aug 20 '16 at 19:50
  • Hi @Makyen, how can I return value of the array to a function since it's not set to a variable? In Google Tag Manager I'm forced to put the code in a function () {code}. Thanks a lot! – Toni2708 Aug 20 '16 at 22:33
  • @Toni2708, I just saw this. I'm not sure what you're asking here. If what you are asking about in your above comment is still relevant, please elaborate. The code in this answer just calculates the number. You will need to add some additional code to do something with the number (e.g. `return` it from a function, assign it to a variable, etc.). – Makyen Aug 21 '16 at 02:03