<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <abstract>
        <p>Sociolinguistics has become an increasingly important and popular field of study, as certain cultures around the world expand their communication base and intergroup and interpersonal relations take on growing significance. Language use symbolically represents fundamental dimensions of social behavior and human interaction. The competitiveness index was created as a means to validate, using social data mining, the influence of sociolinguistic factors within a specific linguistic group. To test this theoretical model, a structural equation model was chosen because of its ability to isolate observational error from the measurement of latent variables. We present a prototype Web-based Decision Support System with data mining capabilities. The purpose of the system is to analyze different social variables in order to determine specific indicators associated with the European Parliament. Original code was developed in Java for an intelligent agent that monitors the main changes in diverse societies, using competitiveness data drawn from different databases and from the websites of diverse organizations. We conducted an experiment applying our prototype system to European societies and their public policies, a region that has developed strongly over the last 50 years. Preliminary results show that the system could be used to model competitiveness based on historical information and to identify critical future scenarios. This system can serve as a base for the development of a prediction model. We also draw on an analysis of the translation of natural language queries in Spanish to SQL involving the GROUP BY grouping clause in Natural Language Interfaces to Databases (NLIDBs), which shows the important role of grouping expressions and the different ways they appear in natural language.</p>
      </abstract>
      <kwd-group>
        <kwd>Web based DSS</kwd>
        <kwd>Decision Support</kwd>
        <kwd>Processing of Natural Language</kwd>
        <kwd>social data mining</kwd>
        <kwd>sociolinguistics</kwd>
        <kwd>competitiveness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>groups. The two main groups in the European Parliament (which together hold 61% of the seats) are the
European People's Party and the Progressive Alliance of Socialists and Democrats.
Since the founding of Parliament in 1952, its powers have been extended several times, especially
through the Maastricht Treaty in 1992 and, more recently, the Lisbon Treaty in 2007.</p>
      <p>The European Parliament has two meeting places: the Louise Weiss building in Strasbourg,
France, where twelve four-day plenary sessions are held each year and which is the official seat of
Parliament, and the Espace Léopold complex in Brussels, Belgium, which is the
larger of the two and hosts committee meetings, political groups and complementary plenary
sessions. The General Secretariat of the European Parliament, its administrative body, is
based in Luxembourg.</p>
      <p>
        Part of the European Parliament's work is selected from proposals submitted from across
Europe, not only from the 27 member states plus Croatia, which begins joining the European Union in January 2013.
Proposals are ranked according to their total score and then voted on by the other European
representatives. Contests in which entrants compete for a better place, such as Eurovision, have been studied from
different perspectives: the compatibility between countries and the political and cultural structures of
Europe [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the persistent structure of hegemony in the Eurovision Song Contest [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
voting culture [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the analysis of the Grand Prix, which evaluates many countries participating in
different years and many different types of countries competing with each other [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], among
others. This research is novel in the type of behavior analysis it performs: proposals from people of different
cultures are analyzed as they take part in this mechanism, which is proposed for the whole
European continent. The objective is to estimate the final ranking of the proposals. The
paper is organized as follows. Section 2 explains the analysis of the 30 calls for proposals, which incorporates a priori
knowledge about voting patterns and relationships among potential winners.
Section 3 defines the problem statement. The COPSO algorithm is explained in detail
in Section 4. In Section 5, our approach is tested on the Call for Proposals 2011, with
proposals from 43 countries including Israel. Section 6 explains the experiments and analysis used to estimate the classification
of a specific proposal in the competition for proposals. The conclusions
are set out in Section 7.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Analysis of Proposals using Data Mining</title>
      <p>
        Data mining is the search for global patterns and relationships among data in huge databases,
which are hidden within the vast amount of information stored in these repositories
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These relationships represent knowledge about the value of the objects in the database. This
information is not necessarily a true copy of the information stored in the databases. Rather, it is
information that can be inferred from the database. One of the main problems in data mining is that
the number of possible relationships that can be extracted is exponential [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Therefore, a variety of
machine learning heuristics have been proposed for knowledge discovery in databases [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. One
of the most popular methods for representing the results of data mining is the decision tree.
A decision tree provides a method for recognizing whether a given case is an example of a concept. It is a "divide and
conquer" strategy for acquiring the concept. Decision trees have been useful in a
variety of case studies in science and engineering. In our case, we use data mining to characterize the
historical voting behavior of each country in accepting proposals. We therefore selected
the societies that have participated and characterized their behavior based on the votes they have already
cast, which allows us to describe the behavior of both societies and individuals. The purpose is to explain υij,
the vote (i.e., the number of points) cast by society i ∈ {1, ..., L} when evaluating
a public policy proposal j ∈ {1, ..., L} (i ≠ j, since a society cannot vote for its own proposal), where L is the
total number of entries submitted from across Europe. Ignoring any other characteristic, the
vote equation could be written simply as
      </p>
      <p><disp-formula><label>(1)</label><tex-math>\upsilon_{ji} = \alpha_{ij}\,\upsilon_{ij} + u_{ij}</tex-math></disp-formula>
where αij is a parameter measuring the commitment implied by υij and uij is a random disturbance. If the exchange of votes were
"perfect", and every proposal maintained its ability to receive votes, αij would equal 1. More
generally, this type of equation should contain variables indexed by k = 1, ..., K representing characteristics
(feasibility, proper timing, etc.) of a given proposal, and variables that represent the different attributes of
that proposal over its involvement in sending proposals to the European Parliament.
(2)
where β and γ are parameters to be estimated. The term associated with the β parameter is
related to the performance attributes of a proposal (a useful public policy for the society). The
term associated with the γ parameter is related to the performance of these proposals during
the assessments in the European Parliament. One problem is that the quantity we want
to estimate, the vote of society i for proposal j, also enters the right-hand side of the
equation.</p>
      <p>This can be treated in several ways. The first and easiest way is, instead of using υij on the
right side, to use the vote from the previous proposal evaluation, say υij-1, although one might
think that societies do not necessarily maintain their commitment over time. An alternative is to use only
half of the observations across all editions of the proposal assessments in Europe, so that a vote υij that appears on
the right side of the equation is not also used on the left side. The vote equation is estimated by linear
methods. The influence of the order in which proposals appear in the list has often been described,
so the exogenous order in which proposals are made is included as a factor. Other variables include
(a) an innovation factor for new proposals, set to 1 for a submitter whose
proposal takes the same approach as a proposal from a similar country; (b) the language in which the
proposal is presented; (c) the nature of the proposal, such as being ecological; and (d)
whether the proposal uses specialized public policy.</p>
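      <p>As a rough illustration of estimating a linear vote equation by least squares, the following sketch fits coefficients on synthetic data. The regressors (a constant, the previous vote, the order of appearance and a cultural distance), the coefficient values and the helper names are all assumptions made for this example; they are not the variables or estimates of the study.</p>
      <preformat><![CDATA[
```python
"""Least-squares sketch of a linear vote equation (illustrative only)."""


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, targets):
    """Return the coefficients that minimise the squared error."""
    k = len(rows[0])
    # Normal equations: (X^T X) coeffs = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical regressors for each (voter, proposal) pair:
# [constant, previous vote, order of appearance, cultural distance]
X = [
    [1.0, 8.0, 1.0, 0.2],
    [1.0, 2.0, 2.0, 0.9],
    [1.0, 10.0, 3.0, 0.1],
    [1.0, 5.0, 4.0, 0.5],
    [1.0, 0.0, 5.0, 1.2],
    [1.0, 7.0, 6.0, 0.3],
]
# Synthetic votes built from known coefficients, so the fit can be checked.
TRUE_COEFFS = [3.0, 0.5, 0.0, -2.0]
y = [sum(c * v for c, v in zip(TRUE_COEFFS, row)) for row in X]

coeffs = ols(X, y)
print([round(c, 6) for c in coeffs])
```
]]></preformat>
      <p>Because the synthetic votes contain no noise, the fitted coefficients should match the generating ones up to rounding; with real voting data a disturbance term would remain.</p>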
      <p>
        The last group of variables includes linguistic and cultural distances between voters and proposals,
so we can afford to dispense with variables that characterize voters. Cultural differences
among societies are represented by the four dimensions studied in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These studies have identified
the following four dimensions that explain "cultural distance":
(A) Power Distance: measures the extent to which the less powerful members of a society accept
that power is distributed unequally, and focuses on the degree of equality between individuals;
(B) Individualism: measures the degree to which individuals in a society are integrated into groups,
and focuses on how far a society reinforces individual or collective achievement and
interpersonal relationships;
(C) Masculinity: refers to the distribution of gender roles in society, and focuses on the degree to which a
society reinforces the traditional masculine work role of male achievement, control and power;
(D) Uncertainty Avoidance: concerns a society's tolerance for uncertainty and ambiguity, and refers
to man's search for truth.
      </p>
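      <p>One simple way to turn the four dimensions above into a single distance is sketched below. The Euclidean formula, the society names and the scores are assumptions chosen for illustration; the text does not state how the distances in the study were computed.</p>
      <preformat><![CDATA[
```python
import math

# The four dimensions listed in the text.
DIMENSIONS = (
    "power_distance",
    "individualism",
    "masculinity",
    "uncertainty_avoidance",
)

# Hypothetical scores, not taken from any real survey.
society_a = {
    "power_distance": 10,
    "individualism": 20,
    "masculinity": 30,
    "uncertainty_avoidance": 40,
}
society_b = {
    "power_distance": 13,
    "individualism": 24,
    "masculinity": 30,
    "uncertainty_avoidance": 40,
}


def cultural_distance(a, b):
    """Euclidean distance over the four dimension scores (an assumed metric)."""
    return math.sqrt(sum((a[d] - b[d]) ** 2 for d in DIMENSIONS))


distance_ab = cultural_distance(society_a, society_b)
print(distance_ab)
```
]]></preformat>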
      <p>Table 1 shows the correlations between the cultural distances and the native languages of the countries that
are present in our sample. Uncertainty avoidance is correlated with the three other variables, but
otherwise the distances seem to capture different dimensions of people's behavior. Configurations can
be generated, metaphorically, from knowledge of the community's behavior with respect to
an optimization problem (forming alliances to obtain a better classification). Columns (a) to (d) of Table 2
contain the results of an OLS estimation of Equation 2.</p>
      <p>First, we note that the quality of the proposal always plays an important role, which is, of
course, not surprising. Logrolling is significant only in column (a), which does not take into account cultural
and linguistic distances. It ceases to be so in all other equations once linguistic and/or
cultural distances are accounted for. It should be noted that, even where the coefficient is significantly different
from zero, its value is very small. The order of appearance plays no role, while among the other
variables, the only one with any influence is "This proposal is focused on the defense of human
rights of a minority." Although not all distance coefficients are significantly different from 0 at the
5 percent level, all of them carry negative signs (the greater the distance, the lower the
rating). Table 3 presents the expected rates of return for 2009. The rate of return attempts to predict
the rank of a proposal through environment variables observed over the last 10 editions of the
evaluation of proposals in Europe. In 2011, 47 societies participated, so obtaining second place was more
complex than for a proposal that won second place in 2002, when only 15 proposals were
received. Historical information is available for all proposals on which the analysis is
performed. Information obtained through data mining indicates similar behavior for
proposals from societies with similar characteristics (language, territorial extent, religion, etc.).
Therefore, the historical performance of each proposal was calculated using multivariate analysis.
The parameters used by the model to calculate the rate of return are β = 0.4 and γ = 0.6.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Defining the problem</title>
      <p>The aim of this study is to estimate the rank of a new proposal relative to the others. This involves
estimating the final vote matrix, where each cell (j, i) holds the score given to proposal j by
society i, i.e., υji. To achieve a fairly good prediction, the model should capture the voting
behavior among the representatives of the various societies in Strasbourg, taking into
account the historical behavior that reflects cultural empathy and the commonality of the regions. The
estimated rate of return could guide the model towards an optimal voting configuration according to the
current expectations of the experts.</p>
      <p>The following objective function combines these two important features of the evaluation of
proposals in the European Parliament, voting behavior and the rate of return, which were explained in
the previous section. Note that Equation 3 is part of Equation 4.</p>
      <p>Maximize (3)
Subject to:
• Proposal j cannot be voted for by the society that proposes it.
• Society j can vote only once for contestant i's proposal.
• Society j can give a score k to only one contending proposal i.</p>
      <p>Here N is the number of societies that vote, C is the number of proposals, S = {12, 10, 8, 7, 6, 5, 4, 3, 2, 1} is the
set of available scores, and maxs = 12 is the maximum score. The first two
terms represent the performance of the final classification. The first term of Equation 3 uses the
probability that score k is given by society j to a given proposal i. This probability
can be calculated for each proposal by observing voting behavior over the last 10 editions of public policy proposal
evaluations. The model explained in this section involves solving a combinatorial
problem that attempts to estimate the final vote for each proposal. The optimization problem has
two parts. In the first part, the problem is to find the optimal combination that maximizes the sum of the
probabilities (first two terms in Equation 4). This means that each voting society
(subject to the constraints mentioned) must allocate the 10 different scores in S among the proposals,
resulting in 1.87E+14 possible combinations. In the second part, the total votes obtained by
each proposal are calculated. In turn, these sums are used to calculate the weighted sum presented in
Equation 3 (third term). This again involves finding the optimal combination among 1.87E+14 possible
solutions. Maximizing the two parts of the problem produces a compromise between
voting behavior and the rate of return. To solve this optimization problem, we use a simple
and innovative PSO for constrained optimization problems, which is detailed in the next section.</p>
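      <p>The constraints and the score set S described above can be sketched as a small feasibility check followed by a ranking by total score. This is a minimal illustration under stated assumptions: the ballot layout (a mapping from proposal to score), the function names and the three-society example are hypothetical, and the probability and rate-of-return terms of the objective are not modelled.</p>
      <preformat><![CDATA[
```python
# Score set S from the text; 12 is the maximum score.
SCORES = (12, 10, 8, 7, 6, 5, 4, 3, 2, 1)


def is_valid_ballot(voter, ballot):
    """Check one society's ballot, given as {proposal: score}.

    The mapping itself guarantees one score per proposal; the checks
    below cover the remaining constraints.
    """
    if voter in ballot:
        return False  # a society cannot vote for its own proposal
    given = list(ballot.values())
    if any(score not in SCORES for score in given):
        return False  # only scores from S may be awarded
    return len(set(given)) == len(given)  # each score goes to one proposal


def final_ranking(ballots):
    """Sum the scores per proposal and rank by total, highest first."""
    totals = {}
    for voter, ballot in ballots.items():
        if not is_valid_ballot(voter, ballot):
            raise ValueError(f"invalid ballot from {voter}")
        for proposal, score in ballot.items():
            totals[proposal] = totals.get(proposal, 0) + score
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(totals, key=lambda p: (-totals[p], p))


# Hypothetical example with three societies, each with one proposal.
ballots = {
    "A": {"B": 12, "C": 10},
    "B": {"A": 12, "C": 10},
    "C": {"A": 12, "B": 10},
}
ranking = final_ranking(ballots)
print(ranking)
```
]]></preformat>
      <p>An optimizer searching over vote matrices could use such a check to reject infeasible candidates before evaluating the objective.</p>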
    </sec>
    <sec id="sec-4">
      <title>4. Constrained optimization through PSO</title>
      <p>
        Particle Swarm Optimization (PSO) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is an algorithm inspired by the movement of a
flock of birds or a school of fish. A member of the flock or school is called a "particle". In PSO,
diversity, called variation, comes from two sources. One is the difference between
the current position xt of the particle and gbest, the best overall performance (the best
solution found by the flock); the other is the difference between the current position xt of the
particle and pbest, the best performance in its own history (the best solution
found by the particle). Although this variation provides diversity, it can be sustained only for a
limited number of generations because the particles converge, so it is necessary to refine the
solution for improvement. The velocity equation combines the particle's local information with
the global information of the flock.
      </p>
      <p>A leader among the particles can be global, for the whole flock, or local, for a small flock. The small flocks
have a structure that defines how information is concentrated and then distributed among the
members. The organization of the flock affects search capacity and convergence. The original
ring structure is implemented by a doubly linked list. COPSO uses an alternative ring
implementation based on a singly linked list. This structure improves the success of the experimental
results by a very significant factor.</p>
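      <p>The ideas above (a velocity update driven by the particle's own best position and by a leader, with a ring neighborhood supplying the local leader) can be sketched with a textbook PSO. This is not the COPSO implementation: the inertia and acceleration values, the three-particle neighborhood, the unconstrained sphere test function and the use of minimization are all assumptions made for illustration, and COPSO's constraint handling and linked-list ring are not reproduced.</p>
      <preformat><![CDATA[
```python
import random


def sphere(x):
    """Simple test objective: sum of squares, minimised at the origin."""
    return sum(v * v for v in x)


def pso_ring(objective, dim=2, n_particles=20, iterations=200,
             w=0.72, c1=1.49, c2=1.49, seed=0):
    """Minimise `objective` with a local-best PSO on a ring topology."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]

    for _ in range(iterations):
        for i in range(n_particles):
            # Ring neighbourhood: the particle and its two adjacent neighbours.
            ring = [(i - 1) % n_particles, i, (i + 1) % n_particles]
            leader = min(ring, key=lambda j: pbest_val[j])
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to own best + pull to local leader.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (pbest[leader][d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            value = objective(pos[i])
            if value < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], value

    best = min(range(n_particles), key=lambda j: pbest_val[j])
    return pbest[best], pbest_val[best]


best_position, best_value = pso_ring(sphere)
print(best_position, best_value)
```
]]></preformat>
      <p>With a ring neighborhood, good solutions spread from neighbor to neighbor rather than reaching every particle at once, which is one way the organization of the flock affects convergence.</p>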
    </sec>
    <sec id="sec-5">
      <title>5. Experimentation and Model Validation</title>
      <p>
        The proposed model was used to estimate the final votes of a group of proposals
from across Europe. These proposals competed against each other in the category of proposals for
the improvement of the environment and were evaluated by the rest of the European countries
through their representatives in the European Parliament. To estimate the vote matrix, 30 runs
of the experiment were conducted; for the group of finalists, we designed the experiments
according to the attributes of each proposal to obtain a better estimate of the final classification. In
each run, 350,000 function evaluations were performed. The average over the 30 runs was
calculated for each proposal. Then, the average ranking was obtained to determine the 24 best
entries that compete in the final ranking. Three measures are calculated from the 30 runs: mean,
median and interquartile range. The interquartile range covers the central 50% of the values around
the median (second quartile, Q2) and is calculated from the lower quartile (first quartile, Q1) and the upper
quartile (third quartile, Q3). In descriptive statistics, a quartile is any of the three values that divide the
sorted data set into four equal parts, so that each part represents one quarter of the sample. The
difference between the upper and lower quartiles is known as the interquartile range. In Section 6,
our approach is applied again to the analysis of public policy proposals in Europe.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Experiments conducted in the evaluation of proposals in the European parliament</title>
      <p>
        In 2012, the European Parliament received at least one proposal from each society (47 countries),
focused on 11 different categories ranging from environment to transport and regional security. The aim
of this experiment is to predict the final ranking of each proposal, as can be seen in Figure 4. For
this experiment, 30 runs were conducted with 27,000 function evaluations. The top 10 of the 30
runs indicate that only proposals with the most votes in the aspects of practicality, feasibility and
financial evaluation were the most attractive when being voted on, which can be understood in light of the difficult
economic situation across Europe since 2008. To estimate the final ranking of the proposals, 30 runs
were conducted with 27,000 evaluations of the objective function (which this hybrid
algorithm seeks to optimize). The average rank, the median and the interquartile range over the 30 runs
were also calculated. Experiments to correctly predict the final standings were based on an
orthogonal array.
      </p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>
        The prediction of future events is a difficult task because it requires extensive
multivariable analysis, and it is impossible in several subject areas [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Several methods
have been used as auxiliary tools for building estimation models. In this paper, data mining and
evolutionary computation are combined to predict the behavior of an evaluation of public policies
by the European Parliament, in an approach very similar to the one proposed in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Our approach proposes
a model that includes two primary features: voting behavior and cultural characteristics [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The
model incorporates historical information on the allocation of votes that societies have made over
earlier editions. The model also includes information on the intrinsic characteristics of
each policy proposal. Future work would be to analyze proposals
from small societies such as the Faroe Islands, Guernsey, Jersey, Liechtenstein, Gibraltar or Kosovo,
which face problems of various kinds that are very different from those of countries with more than 1 million inhabitants.
The OECD currently uses such innovative methods based on artificial intelligence to properly
characterize and evaluate the different views of different societies so that
all voices can be heard, including those of minorities. Something similar could be used in Ciudad Juarez,
where 37.14% of the population was not born in the state of Chihuahua, a figure that has reached 46.89% of
people in the city; an approach that is not nation-based could improve the participation of those two large minorities.
      </p>
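      <p>The summary measures used for the experiments in Section 5 (mean, median and interquartile range over repeated runs) can be computed as in the following sketch. The ranks are hypothetical, nine runs are used instead of 30 for brevity, and the inclusive quartile method is an assumption, since the text does not say which quartile convention was applied.</p>
      <preformat><![CDATA[
```python
import statistics

# Hypothetical final ranks of one proposal over nine independent runs.
ranks = [3, 1, 4, 2, 5, 9, 7, 6, 8]


def summarize(values):
    """Return the mean, median and interquartile range of a sample."""
    # Q1, Q2 and Q3 split the sorted data into four equal parts.
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,  # upper quartile minus lower quartile
    }


summary = summarize(ranks)
print(summary)
```
]]></preformat>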
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ginsburgh</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Noury</surname>
          </string-name>
          .
          <article-title>Cultural voting: The Eurovision Song Contest</article-title>
          . http://ssrn.com/abstract=884379,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kennedy</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Eberhart</surname>
          </string-name>
          .
          <source>The Particle Swarm: Social Adaptation in Information-Processing Systems</source>
          . McGraw-Hill, London,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ochoa</surname>
          </string-name>
          et al.
          <article-title>Italianità: Discovering a Pygmalion effect on Italian Communities using data mining</article-title>
          .
          <source>In Proceedings of CORE'</source>
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rauhlen</surname>
          </string-name>
          .
          <source>Culture's Consequences</source>
          . Beverly Hills, Calif.: Sage,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Suaremi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Hal</given-names>
            <surname>Shikari</surname>
          </string-name>
          &amp; Shayera.
          <article-title>Understand social groups using artificial intelligence techniques</article-title>
          .
          <source>In Proceedings of NDAM'</source>
          <year>2006</year>
          , Reykjavik, Iceland,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Yair</surname>
          </string-name>
          .
          <article-title>Unite Unite Europe: The political and cultural structures of Europe as reflected in the Eurovision Song Contest</article-title>
          .
          <source>Social Networks</source>
          ,
          <volume>17</volume>
          (
          <issue>2</issue>
          ):
          <fpage>147</fpage>
          -
          <lpage>161</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Yair</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Maman</surname>
          </string-name>
          .
          <article-title>The persistent structure of hegemony in the Eurovision Song Contest</article-title>
          .
          <source>Acta Sociologica</source>
          ,
          <volume>39</volume>
          :
          <fpage>309</fpage>
          -
          <lpage>325</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Zolezzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dori</given-names>
            <surname>Aandraison</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>A.</given-names>
            <surname>Ochoa-Zezzatti</surname>
          </string-name>
          .
          <article-title>A model to explain the extinction of San Benedicto Rock Wren using Cultural Algorithms</article-title>
          .
          <source>In Proceedings of OCAAI'2007</source>
          . Baku, Azerbaijan,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Noudher</surname>
          </string-name>
          . Palestine in Eurovision.
          <source>Master Thesis</source>
          of Sociology of Islamic University of Gaza,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>