This week, we’re staying with the idea of career choice but are going about as far away as you can get from Holland’s career congruence and person-environment fit — so hold on.
In the 1988 film “Bull Durham,” aging minor league baseball catcher and slugger Crash Davis (Kevin Costner) complains to Annie Savoy (Susan Sarandon) about the inherent unfairness that she, rather than he or Ebby Calvin LaLoosh (Tim Robbins), gets to decide which of the two will receive her personal favors and coaching mentorship for the season. He asks her, “Why do you get to choose?… Why don’t I get to choose? Why doesn’t he get to choose?”
She replies, “Well, actually, nobody on this planet ever really chooses… I mean, it’s all a question of quantum physics, molecular attraction, and timing. Why, there are laws we don’t understand that bring us together and tear us apart.”
Organizational writer Gareth Morgan, in his Images of Organization (Sage, 1997), explores the use of nine metaphors to examine ways of considering organizations. One of those metaphors, “flux and transformation” (see chapter nine), presents us with four “logics of change,” embracing all of the ideas to which Annie alluded — and much more.
Morgan’s second logic of change, “shifting attractors” (the logic of chaos and complexity), is particularly interesting. Though the book was written with regard to the relationship between organizations and their environments, it’s fun to layer some of these ideas onto individuals and their careers. As we discussed last week, the applicability of choice when considering careers is open to question. A great career fit based on congruence may or may not exist. If it does exist, it may be difficult to discover — or its competitive nature may exclude all but the most skilled and talented. It may be a career that’s gone in 20 or even 10 years, or it may require the careerist to play a role that doesn’t seem quite as attractive a few years down the road.
So, then where else might we look in making career choices?
Drawing from the theories that inform Morgan’s second logic of change, here are some ideas for you to ponder.
Chaos theory posits competing attractors – i.e., circumstances or “contexts” that pull a non-linear system toward one situation or the other – for example, away from an existing context and into a new one. In order for the pull to resolve in favor of a new context, a system gets pushed far from its equilibrium into an “edge of chaos” situation, where “bifurcation points” (forks in the road) emerge. These bifurcation points represent different potentials. Inevitably, some sort of new order will emerge, though it cannot be predicted or imposed. Morgan advises that the implication for managers is to “shape and create ‘contexts’ in which appropriate forms of self-organization can occur.” New contexts, he continues, can be created by generating “new understandings of a situation or by engaging in new actions.” Further, in non-linear systems, it only takes very, very small changes at critical times to trigger “major transforming effects.” Anyone, he continues, who wishes to change the context in which he operates should search for “doable, high-leverage initiatives that can trigger a transition from one attractor to another.”
This is all very esoteric, but what it might really come down to for the individual is being on alert to recognize situations in one’s employment context where competing attractors have the potential to create “edge of chaos” situations. If there is a practical lesson here – other than continually scanning the horizon of one’s employment context – it might just be to think small instead of thinking big.
Here’s a personal example, which only in retrospect makes sense – as I certainly had no idea what I was doing at the time… When I was downsized (made redundant) in 1993, the company I worked for went to great lengths to provide helpful support to those of us who had been displaced. It staffed and opened a full-time outplacement center, provided a generous severance package and gave us two weeks to vacate. I had planned to use the career center – but first, I went around the building leaving handwritten notes on the doors and desks of people I knew, advising that I would be available to help with projects, if needed, until I figured out what I was going to do. (Broad-based work solicitation wasn’t permitted within the old context.) Well, I only made it to the career center once — because that one small series of note-leaving acts resulted in a deluge of consulting work that launched a new career. The downsizing had created an “edge of chaos” situation that led to a new context – one in which my skills could now be used for the benefit of the organization. Through naïveté and uncertainty, I had somehow navigated a bifurcation point in a way that has worked out pretty well – at least so far. I’m a little embarrassed to be using this personal example because there was such an element of luck involved — and this good fortune is not something I take for granted.
Just please take the following away: If you and your career are verging on an edge of chaos situation, are there small actions that you can leverage into major transformations?
If anyone has thoughts or examples, please share.
Till next week. All my best,
Jan
Morgan, G. (1997). Images of Organization. Thousand Oaks, CA: Sage.
On March 24, a FINDALL search in Google for the keywords density optimization returned 240,000 documents. Many of these documents belong to search engine marketing and optimization (SEM, SEO) specialists. Some of them promote keyword density (KD) analysis tools, while others talk about things like “right density weighting”, “excellent keyword density”, KD as a “concentration” or “strength” ratio, and the like. Still others take KD for the weight of term i in document j, or propose localized KD ranges for titles, descriptions, paragraphs, tables, links, urls, etc. One can even find specialists going after the latest KD “trick” and claiming that optimizing KD values to within a certain range for a given search engine affects the way the engine scores relevancy and ranks documents.
Given that there are so many KD theories flying around, my good friend Mike Grehan approached me after Jupitermedia’s 2005 Search Engine Strategies Conference in New York and invited me to do something about it. I felt the “something” should be a balanced article mixing a bit of IR, semantics and math, but with no conclusion, so readers could draw their own. So, here we go.
Background.
In the search engine marketing literature, keyword density is defined as
Equation 1

$$KD_{i,j} = \frac{tf_{i,j}}{l}$$

where $tf_{i,j}$ is the number of times term i appears in document j and $l$ is the total number of terms in the document. Equation 1 is a legacy idea found intermingled in the old literature on readability theory, where word frequency ratios are calculated for passages and text windows – phrases, sentences, paragraphs or entire documents – and combined with other readability tests.
The notion of keyword density values predates all commercial search engines and the Internet and can hardly be considered an IR concept. What is worse, KD plays no role in how commercial search engines process text, index documents or assign weights to terms. Why, then, do many optimizers still believe in KD values? The answer is simple: misinformation.
If two documents, D1 and D2, consist of 1000 terms (l = 1000) and repeat a term 20 times (tf = 20), then for both documents KD = 20/1000 = 0.020 (or 2%) for that term. Identical values are obtained if tf = 10 and l = 500.
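As a quick sanity check of Equation 1, here is a minimal Python sketch (the term “widget” and the filler tokens are hypothetical stand-ins for D1 and D2) that reproduces the 2% figure above:

```python
from collections import Counter

def keyword_density(term, terms):
    """Equation 1: tf_ij divided by the total number of terms l."""
    return Counter(terms)[term] / len(terms)

# Toy stand-ins for D1 and D2: different sizes, same tf/l ratio.
d1 = ["widget"] * 20 + ["filler"] * 980   # tf = 20, l = 1000
d2 = ["widget"] * 10 + ["filler"] * 490   # tf = 10, l = 500

print(keyword_density("widget", d1))  # 0.02
print(keyword_density("widget", d2))  # 0.02
```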
Evidently, this overall ratio tells us nothing about:
1. the relative distance between keywords in documents (proximity)
2. where in a document the terms occur (distribution)
3. the co-citation frequency between terms (co-occurrence)
4. the main theme, topic, and sub-topics (on-topic issues) of the documents
Thus, KD is divorced from content quality, semantics and relevancy. Under these circumstances one can hardly talk about optimizing term weights for ranking purposes. Add to this copy style issues and you get a good idea of why this article’s title is The Keyword Density of Non-Sense.
The following five search engine implementations illustrate the point:
1. Linearization
2. Tokenization
3. Filtration
4. Stemming
5. Weighting
Linearization.
Linearization is the process of stripping the markup tags from a web document so that its content is reinterpreted as a string of characters to be scored. This process is carried out tag by tag, as tags are declared and found in the source code. As illustrated in Figure 1, linearization affects the way search engines “see”, “read” and “judge” Web content, so to speak. Here the content of a website is rendered using two nested html tables, each consisting of one large cell at the top and the common 3-column cell format. We assume that no other text or html tags are present in the source code. The numbers at the top-right corner of the cells indicate in which order a search engine finds and interprets the content of the cells.
The box at the bottom of Figure 1 illustrates how a search engine probably “sees”, “reads” and “interprets” the content of this document after linearization. Note the lack of coherence and theming. Two term sequences illustrate the point: “Find Information About Food on sale!” and “Clients Visit our Partners”. This state of the content is probably hidden from the untrained eyes of average users. Clearly, linearization has a detrimental effect on keyword positioning, proximity, distribution and on the effective content to be “judged” and scored. The effect worsens as more nested tables and html tags are used, to the point that after linearization content perceived as meritorious by a human can be interpreted as plain garbage by a search engine. Thus, computing localized KD values is a futile exercise.
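For readers who want to experiment, here is a minimal linearization sketch using Python’s built-in HTML parser. The table markup is a hypothetical, single-table stand-in for the nested layout of Figure 1, but it produces the same kind of flattened term stream discussed above:

```python
from html.parser import HTMLParser

class Linearizer(HTMLParser):
    """Collect only the text nodes, in source order, ignoring all markup."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

# Hypothetical table layout standing in for Figure 1.
html = """
<table>
  <tr><td colspan="3">Find Information About</td></tr>
  <tr><td>Food on sale!</td><td>Clients</td><td>Visit our Partners</td></tr>
</table>
"""

lin = Linearizer()
lin.feed(html)
print(" ".join(lin.chunks))
# Find Information About Food on sale! Clients Visit our Partners
```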
Burning the Trees and Keyword Weight Fights.
In the best-case scenario, linearization shows whether words, phrases and passages end up competing for relevancy in a distorted lexicographical tree. I call this phenomenon “burning the trees”. It is one of the most overlooked web design and optimization problems.
Constructing a lexicographical tree out of linearized content reveals the actual state of, and relationships between, nouns, adjectives, verbs and phrases as they are actually embedded in documents. It shows the effective data structure that is being used. In many cases, linearization identifies local document concepts (noun groups) and hidden grammatical patterns. Mandelbrot has used the patterned nature of languages observed in lexicographical trees to propose a measure he calls the “temperature of discourse”. He writes: “The ‘hotter’ the discourse, the higher the probability of use of rare words.” (1). However, from the semantics standpoint, word rarity is a context-dependent state. Thus, in my view “burning the trees” is a natural consequence of misplacing terms.
In Fractals and Sentence Production, Chapter 9 of From Complexity to Creativity (2, 3), Ben Goertzel uses an L-System model to explain that the beginning of early childhood grammar is the two-word sentence, in which the iterative pattern involving nouns (N) and verbs (V) is driven by a rule in which V is replaced by V N (V >> V N). This can be illustrated with the following two iteration stages:
0 N V (as in Stevie byebye)
1 N V N (as in Stevie byebye car)
Goertzel explains, “The reason N V is a more natural combination is because it occurs at an earlier step in the derivation process.” (3). It now becomes clear why many Web documents do not deliver any appealing message to search engines. After linearization, one realizes that these documents may be “speaking” like babies. [By the way, L-System algorithms, named after A. Lindenmayer, have been used for many years in the study of tree-like patterns (4)].
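A minimal sketch of the V >> V N rewrite rule, assuming the simple list-of-symbols representation below, makes the two stages easy to reproduce:

```python
def lsystem(axiom, rules, steps):
    """Rewrite every symbol according to the rules, 'steps' times."""
    symbols = list(axiom)
    for _ in range(steps):
        symbols = [out for s in symbols for out in rules.get(s, [s])]
    return symbols

# Goertzel's toy grammar: V is replaced by V N, N is left unchanged.
rules = {"V": ["V", "N"]}

print(lsystem(["N", "V"], rules, 0))  # ['N', 'V']       -> "Stevie byebye"
print(lsystem(["N", "V"], rules, 1))  # ['N', 'V', 'N']  -> "Stevie byebye car"
print(lsystem(["N", "V"], rules, 2))  # ['N', 'V', 'N', 'N']
```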
“Burning the trees” explains why repeating terms in a document, moving on-page factors around or invoking link strategies does not necessarily improve relevancy. In many instances one can get the opposite result. I recommend that SEOs start incorporating lexicographical/word pattern techniques, linearization strategies and local context analysis (LCA) into their optimization mix. (5)
In Figure 1, “burning the trees” was the result of improper positioning of text. However, in many cases the effect is a byproduct of sloppy Web design, poor usability or improper use of the HTML DOM structure (another kind of tree). This underscores an important W3C recommendation: that html tables should be used for presenting tabular data, not for designing Web documents. In most cases, professional web designers can do better by replacing tables with cascading style sheets (CSS).
“Burning the trees” often leads to another phenomenon I call “keyword weight fights”. It is a recurrent problem encountered during topic identification (topic spotting), text segmentation (based on topic changes) and on-topic analysis (6). Considering that co-occurrence patterns of words and word classes provide important information about how a language is used, misplaced keywords and text without clear topic transitions complicate the work of text summarization editors (human or machine-based) that need to generate representative headings and outlines from documents.
Thus, the “fight” unnecessarily complicates topic disambiguation and the work of human abstractors who, during document classification, need to answer questions like “What is this document or passage about?”, “What is the theme or category of this document, section or paragraph?”, “How does this block of links relate to the content?”, etc.
While linearization renders localized KD values useless, document indexing makes a myth out of this metric. Let’s see why.
Tokenization, Filtration and Stemming.
Document indexing is the process of transforming the text of a document into a representation of its content, and it consists of three steps: tokenization, filtration and stemming.
During tokenization, terms are lowercased and punctuation is removed. Rules must be in place so that digits, hyphens and other symbols can be parsed properly. Tokenization is followed by filtration. During filtration, commonly used terms and terms that do not add any semantic meaning (stopwords) are removed. In most IR systems the surviving terms are further reduced to common stems or roots. This is known as stemming. Thus, the initial content of length l is reduced to a list of terms (stems and words) of length l’ (i.e., l’ < l). These processes are described in Figure 2. Evidently, if linearization shows that you have already “burned the trees”, a search engine will be indexing just that.
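The three steps can be sketched in a few lines of Python. The stopword list and the suffix stripper below are deliberately tiny, hypothetical stand-ins for the filtration rules and stemmers (e.g., Porter’s) used by real systems:

```python
import re

STOPWORDS = {"the", "a", "an", "and", "or", "of", "on", "in", "to", "our", "about"}

def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def filtrate(tokens):
    """Drop stopwords, i.e., terms that add little semantic meaning."""
    return [t for t in tokens if t not in STOPWORDS]

def stem(token):
    """A crude suffix stripper standing in for a real stemming algorithm."""
    for suffix in ("ing", "ers", "er", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

text = "Find Information About Food on sale! Clients Visit our Partners"
tokens = tokenize(text)                            # length l
survivors = [stem(t) for t in filtrate(tokens)]    # length l' < l
print(len(tokens), tokens)
print(len(survivors), survivors)
```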
Similar lists can be extracted from individual documents and merged to form an index of terms. This index can be used for different purposes; for instance, to compute term weights and to represent documents and queries as term vectors in a term space.
Weighting.
The weight of a term in a document is built from three different types of term weighting: local, global, and normalization. The term weight is given by
Equation 2

$$w_{i,j} = L_{i,j} \times G_i \times N_j$$
where $L_{i,j}$ is the local weight for term i in document j, $G_i$ is the global weight for term i, and $N_j$ is the normalization factor for document j. Local weights are functions of how many times each term occurs in a document, global weights are functions of how many documents in the collection contain each term, and the normalization factor corrects for discrepancies in the lengths of the documents.
In the classic Term Vector Space model
Equations 3, 4 and 5

$$L_{i,j} = tf_{i,j}, \qquad G_i = \log\left(\frac{D}{d_i}\right), \qquad N_j = 1$$
which reduces to the well-known tf*IDF weighting scheme
Equation 6

$$w_{i,j} = tf_{i,j} \times \log\left(\frac{D}{d_i}\right)$$
where $\log(D/d_i)$ is the Inverse Document Frequency (IDF), $D$ is the number of documents in the collection (the database size) and $d_i$ is the number of documents containing term i.
Equation 6 is just one of many term weighting schemes found in the term vector literature. Depending on how L, G and N are defined, different weighting schemes can be proposed for documents and queries.
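As an illustration only, here is a small Python sketch of the classic scheme above (L = tf, G = log(D/d), N = 1), computed over three hypothetical, already-indexed documents; base-10 logarithms are assumed, consistent with the D = 10*d remark in the next section:

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Equation 6: w_ij = tf_ij * log10(D / d_i), with N_j = 1."""
    D = len(docs)
    df = Counter()                 # d_i: number of documents containing term i
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)          # L_ij = tf_ij
        weights.append({t: tf[t] * math.log10(D / df[t]) for t in tf})
    return weights

# Three toy documents, assumed to be already tokenized, filtered and stemmed.
docs = [
    ["food", "sale", "client", "partner"],
    ["food", "recipe", "client"],
    ["partner", "visit"],
]
for w in tfidf_weights(docs):
    print(w)
```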
KD values as estimators of term weights?
The only way that KD values could be taken for term weights
Equation 7

$$w_{i,j} = KD_{i,j} = \frac{tf_{i,j}}{l_j}$$
is if global weights are ignored and the normalization factor $N_j$ is redefined in terms of document lengths
Equation 8

$$G_i = 1, \qquad N_j = \frac{1}{l_j}$$
However, $G_i = IDF = 1$ constrains the collection size $D$ to be equal to ten times the number of documents containing the term ($D = 10 \cdot d_i$), and $N_j = 1/l_j$ implies no stopword filtration. These conditions are not observed in commercial search systems.
Using a probabilistic term vector scheme in which IDF is defined as
Equation 9

$$IDF_i = \log\left(\frac{D - d_i}{d_i}\right)$$
does not help either, since the condition $G_i = IDF = 1$ implies that $D = 11 \cdot d_i$. Additional unrealistic constraints can be derived for other weighting schemes when $G_i = 1$.
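For the record, both constraints follow from setting the IDF expressions equal to one (base-10 logarithms are assumed here, matching the D = 10*d and D = 11*d figures above):

$$\log_{10}\left(\frac{D}{d_i}\right) = 1 \;\Rightarrow\; D = 10\,d_i \qquad\text{and}\qquad \log_{10}\left(\frac{D - d_i}{d_i}\right) = 1 \;\Rightarrow\; D = 11\,d_i$$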
To sum up, the assumption that KD values could be taken for estimates of term weights or that these values could be used for optimization purposes amounts to the Keyword Density of Non-Sense.
References
1. Mandelbrot, B. B. (1983). The Fractal Geometry of Nature, Chapter 38. W. H. Freeman.
A well-known conference for an academic getaway is the Euroma conference, an annual event where academics from around the world, including the UK, gather to discuss important matters around operations management. Included in this brain-expanding event is a brain-cell-destroying gala dinner where some 500 or so profs and lecturers drink copious quantities of alcohol out of sight of the prying eyes of the university 🙂
Here is a short clip of the fantastic entertainment put on for the Gala dinner by our hosts. More on the actual content of the conference later.