Levana May 01, 2016
This kind of paper provides details on how something looks, smells, feels, tastes or sounds. It also describes in detail the feelings the subject evokes. A descriptive essay uses a lot of sensory detail, and it usually takes a writer of exceptional skill to craft one.
This kind of paper, the persuasive essay, attempts to get the reader to adopt the author’s point of view. Different approaches can be used, but the main thing is to get the reader to side with you.
This kind of essay, the process essay, describes how something is done. It explains the actions to be taken, or the steps of a process, that lead to the outcome. This calls for precision and clarity in writing.
Essay writing: workload that is too much to bear
Of the 44% undergrad college students taking up different courses, the average number of hours dedicated to written.
Different types of conjunctions link information in different ways. It is important to know which conjunctions establish which types of links.
There are five main types of conjunction in English.
1. Conjunctions of addition and replacement
Additive conjunctions simply add more information to what is already there. Examples of additive conjunctions include:
and, also, in addition, not only … but also, moreover, further, besides.
The study used a small sample only and was strongly criticized for this reason. Furthermore, the initial premise of the research was considered questionable in the light of previous evidence.
Conjunctions of replacement replace one piece of information with another. That is, they offer an alternative. Conjunctions of replacement include the words:
or, or else, alternatively.
The results could be interpreted to mean that high levels of protein are beneficial to diet generally. Alternatively, they could also mean that high protein levels are only beneficial to severely overweight males.
2. Conjunctions of comparison, contrast and concession
Comparative conjunctions are used to link two ideas that are considered to be similar. Comparative conjunctions include the terms:
in the same way, likewise, just as, both … and.
Reading aloud to young children stimulates their interest in books. Similarly, visiting libraries or book fairs has been shown to increase children’s readiness to engage with print.
Contrastive conjunctions link two ideas that are considered to be different. Examples of contrastive conjunctions include:
but, however, in contrast, on the contrary, instead, nevertheless, yet, still, even so, neither … nor.
This evidence points clearly to a fall in the number of unemployed. On the other hand, anecdotal evidence from reputable charities suggests that the number of people seeking financial support has increased.
Concessive conjunctions are a subgroup of contrastive conjunctions. They are used to contrast one idea with another where one piece of information appears to be surprising or unexpected in view of the other idea. Examples of concessive conjunctions include:
though, although, despite, in spite of, notwithstanding, whereas, while.
Even though money has been poured into literacy programs, literacy levels among 12-15 year olds do not appear to be improving.
For more detail on the use of concessive conjunctions in reporting evidence from source documents, go to Module 2, Unit 4: Using Concessives.
An essay is generally a short piece of writing in which the author presents his or her ideas on a particular subject in his or her own words. It is difficult to define exactly what an essay is, but we can understand the basic ideas behind the concept: what an essay is, what its various forms are, how to write one and what format to follow.
Here, we will discuss the various types of essays and how to use them in our assignments and write-ups.
Expository essay – An expository essay is an explanation, from the writer’s perspective, of a short theme, issue or idea. In other words, you are clarifying an issue, theme or idea for your intended audience. Your response to a piece of writing could take the form of an expository essay, for example if you decide simply to explain your personal response to a work.
Persuasive essay – A persuasive essay is the type of essay in which the author tries to persuade the reader to see his or her point of view on a particular topic. The most important elements of this type of essay are a definite point of view, evidence and reasons, an understanding of the audience, convincing power and thorough research on the essay topic.
Analytical essay – In this type of essay, the writer interprets and analyzes a work thoroughly. The works in question may be poems, books, plays, events and other such artistic works. An analytical essay is expected to follow a proper format.
Argumentative essay – Argumentative essays are the type of essay in which the writer sets out to prove that his opinion, hypothesis or view is correct and more truthful than those of other writers. The main difference between a persuasive essay and an argumentative essay is that in a persuasive essay the writer tries to make readers understand his point of view and adopt it, whereas in an argumentative essay the writer argues for his point of view and his own opinions.
Informal essay – Informal essays are fairly easy to write because the writer expresses his point of view in a very informal manner; sometimes the writing style is very entertaining. This does not mean that an informal essay cannot have the effect of an argumentative or research-based essay; its appeal is that the writer can use a conversational style freely.
Review essay – Review essays are substantial reviews of at least 2 (usually 3 or 4) readings covered in the course. Often the readings will be from the same week, but writers are free to choose readings from different weeks if they believe they can be usefully contrasted. The purpose of these essays is to allow writers to show that they understand the arguments or main points of several readings and can analyze them in a logical, integrated and thematic fashion.
Research essay – A research essay should lead the reader to the works of others and guide the reader in comparing earlier research with the current research essay. A research essay teaches the reader about a subject, and it clearly identifies the hypothesis and the main points of the essay.
Literary essay – The literary essay is one of the most attractive and one of the most complicated writing assignments. In this type of essay the writer is asked to explore certain pieces of literature and assess particular aspects of the book that he has read.
In the literary essay the writer should highlight such elements as subtext, structure and style. The writer is supposed to examine a written work or composition and try to find out why it was structured in the way it was.
Cause and effect essay – Cause and effect essays concentrate on why things happen (the causes) and on what happens as a result (the effects). Cause and effect is a common way of organizing and discussing ideas in an essay.
Comparison essay – To write a comparison essay that is easy to follow, the writer first has to decide what the similarities or differences are. He then has to identify which are more important, the similarities or the differences, and plan to discuss the less important ones first, followed by the more important ones. It is a lot easier to mention only the similarities or only the differences, but he can also cover both in the essay.
Descriptive essay – Descriptive essays attempt to create an intense and vivid impression for the reader. Great descriptive essays achieve this not only through information and data but also through detailed observations and descriptions.
System and method for computer-based automatic essay scoring
US 6366759 B1
A method of grading an essay using an automated essay scoring system is provided. The method comprises the automated steps of (a) parsing the essay to produce parsed text, wherein the parsed text is a syntactic representation of the essay, (b) using the parsed text to create a vector of syntactic features derived from the essay, (c) using the parsed text to create a vector of rhetorical features derived from the essay, (d) creating a first score feature derived from the essay, (e) creating a second score feature derived from the essay, and (f) processing the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature to generate a score for the essay. The essay scoring system comprises a Syntactic Feature Analysis program which creates a vector of syntactic features of the electronic essay text, a Rhetorical Feature Analysis program which creates a vector of rhetorical features of the electronic essay text, an EssayContent program which creates a first Essay Score Feature, an ArgContent program which creates a second Essay Score Feature, and a scoring engine which generates a final score for the essay from the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature.
What is claimed is:
1. A method of grading an essay using an automated essay scoring system, the essay being a response to a test question, comprising:
(a) deriving a vector of syntactic features from the essay;
(b) deriving a vector of rhetorical features from the essay;
(c) deriving a first score feature from the essay;
(d) deriving a second score feature from the essay; and
(e) processing the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature to generate a score for the essay.
2. The method of claim 1 further comprising the step of:
(f) creating a predictive feature set for the test question, wherein the predictive feature set represents a model feature set that is predictive of a range of scores for the test question,
wherein in step (e), a scoring guide is derived from the predictive feature set and the score for the essay is assigned based on the scoring guide.
3. The method of claim 2, wherein the scores defined by said scoring guide are based on holistic scoring rubrics and range from 0 to 6.
4. The method of claim 2, wherein there is a batch of original essays which are essays of a known score to the test question and in the form of original electronic essay texts, and wherein step (f) of creating a predictive feature set comprises the steps of repeating steps (a) through (e) for the batch of original essays and processing the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature for each original essay using a linear regression to generate a predictive feature set for the test question.
5. The method of claim 1 wherein said step of deriving a vector of syntactic features from the essay comprises the following steps:
counting the number of different clause types for each sentence in the essay;
selecting one or more of said different clause types as predictive variables; and
creating a vector of syntactic counts from the predictive variables.
6. The method of claim 5 wherein said step of counting the number of different clause types for each sentence in the essay comprises counting the number of at least the following clause types: complement clauses, subordinate clauses, infinitive clauses, relative clauses and subjunctive modal auxiliary verbs.
7. The method of claim 5, wherein said method is used to grade an argument essay and said step of selecting one or more of said different clauses as predictive variables comprises selecting a total number of modal auxiliary verbs and a ratio of complement clauses per sentence as predictive variables.
8. The method of claim 5, wherein said method is used to grade an issue essay and said step of selecting one or more of said different clauses as a predictive variable comprises selecting a total number of infinitive clauses and a total number of modal auxiliary verbs per paragraph as predictive variables.
9. The method of claim 1, wherein said step of deriving a vector of rhetorical features from the essay comprises:
identifying an argument structure for each sentence in the essay; and
counting the occurrences of selected argument predictive variables to create a vector of rhetorical features.
10. The method of claim 1, wherein said step of deriving a first score feature from the essay comprises:
creating a list of words appearing in the essay;
creating a frequency vector identifying the frequency with which each word in said list of words appears in the essay;
creating for each of a plurality of score classes, each score class having a score class essay, a class frequency vector identifying the frequency with which each word appears in the class essay;
assigning a weight to each word in the frequency vector based on the salience of the word;
computing for each of the plurality of score class essays, a cosine correlation between the essay and the score class essay by comparing the frequency that particular words appear in the essay and the frequency that the same words appear in the score class essay; and
selecting as the first score feature the score class having the highest correlation.
11. The method of claim 10, wherein said step of deriving a first score feature from the essay further comprises applying a morphological analysis to the list of words appearing in the essay.
12. The method of claim 10, wherein the step of computing for each of the plurality of score classes a cosine correlation between the essay and each of the score class essays comprises computing the following:
\[ \cos(a, b) = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2 \, \sum_i b_i^2}} \]
where $a_i$ is the frequency of the word “i” in the essay and $b_i$ is the frequency of the word “i” in a particular score class essay.
13. The method of claim 1, wherein said step of deriving a second score feature from the essay comprises:
generating argument partitioned text from the essay, said argument partitioned text comprising a structure identifier describing an aspect of the argument structure of the sentence;
for each of a plurality of score classes, each score class having a plurality of score class essays, creating a word weight vector for each of a set of argument words in the plurality of score class essays;
creating a word weight vector for each of a set of argument words in the essay;
computing for each of the set of argument words in the essay, a cosine correlation between the argument word weight vector for a particular argument word in the essay and the word weight vector for the same argument word in the plurality of score class essays;
assigning to each of the set of argument words in the essay the score class having the highest cosine correlation; and
calculating an adjusted mean from the score classes assigned to each of the set of argument words.
14. The method of claim 13, wherein the step of creating a word weight vector for each of a set of argument words in the plurality of score class essays comprises calculating the word weight vector using the following equation:
\[ w_{i,s} = \frac{\mathrm{freq}_{i,s}}{\mathrm{max\_freq}_{s}} \cdot \log\left(\frac{\mathrm{n\_essays}_{total}}{\mathrm{n\_essays}_{i}}\right) \]
wherein $\mathrm{freq}_{i,s}$ is the frequency of argument word “i” in score class “s,” $\mathrm{max\_freq}_{s}$ is the frequency of the most frequent argument word in score class “s,” $\mathrm{n\_essays}_{total}$ is the total number of essays across the plurality of score classes, and $\mathrm{n\_essays}_{i}$ is the number of essays across the plurality of score classes containing word “i.”
15. The method of claim 13, wherein the step of calculating an adjusted mean for the score classes assigned to each of the set of words comprises calculating the word weight vector using the equation:
\[ w_{i,a} = \frac{\mathrm{freq}_{i,a}}{\mathrm{max\_freq}_{a}} \cdot \log\left(\frac{\mathrm{n\_essays}_{total}}{\mathrm{n\_essays}_{i}}\right) \]
wherein $\mathrm{freq}_{i,a}$ is the frequency of argument word “i” in argument “a,” $\mathrm{max\_freq}_{a}$ is the frequency of the most frequent argument word in score class “a,” $\mathrm{n\_essays}_{total}$ is the total number of essays across the plurality of score classes, and $\mathrm{n\_essays}_{i}$ is the number of essays across the plurality of score classes containing word “i.”
16. A computer readable storage device having computer executable instructions thereon for performing the steps recited in claim 1.
17. A system for automatically grading an essay, the essay being responsive to a test question, comprising:
a memory device; and
a processor, wherein said processor is operable to execute instructions for performing the following:
(a) deriving a vector of syntactic features from the essay;
(b) deriving a vector of rhetorical features from the essay;
(c) deriving a first score feature from the essay;
(d) deriving a second score feature from the essay; and
(e) processing the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature to generate a score for the essay.
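As an illustration of the content-vector steps recited in claims 10 through 14, the following Python sketch computes a salience weight for each word, a cosine correlation between two word vectors, and the score class whose essay correlates best with a new essay. It is a minimal sketch under stated assumptions, not the patented implementation: the function names (`word_weights`, `cosine`, `best_score_class`) are hypothetical, tokenization is plain whitespace splitting, no morphological analysis is applied, and `best_score_class` compares raw frequencies rather than weighted ones.

```python
import math
from collections import Counter


def word_weights(class_counts, n_essays_total, n_essays_with_word):
    """Weight each word as (freq / max_freq) * log(n_essays_total / n_essays_i).

    class_counts: word -> frequency within one score class.
    n_essays_with_word: word -> number of essays containing that word.
    """
    max_freq = max(class_counts.values())
    return {
        word: (freq / max_freq)
        * math.log(n_essays_total / n_essays_with_word[word])
        for word, freq in class_counts.items()
    }


def cosine(a, b):
    """Cosine correlation between two sparse word vectors stored as dicts."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector correlates with nothing
    return dot / (norm_a * norm_b)


def best_score_class(essay_text, class_texts):
    """Return the score class whose essay text correlates best with the essay.

    class_texts: score -> text of the score class essay (assumed input shape).
    """
    essay_vec = Counter(essay_text.lower().split())
    sims = {
        score: cosine(essay_vec, Counter(text.lower().split()))
        for score, text in class_texts.items()
    }
    return max(sims, key=sims.get)
```

A word that appears in every essay receives a weight of zero under this formula, because the logarithm of 1 is 0, so very common words contribute nothing to the comparison.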
CROSS REFERENCE TO RELATED APPLICATIONS
This Application is a continuation of U.S. Provisional Application Serial No. 09/120,427 filed Jul. 22, 1998, entitled ‘System and Method for Computer-Based Automatic Essay Scoring’, the contents of which are hereby incorporated by reference in their entirety.
This application is related to U.S. Provisional Patent Application Serial No. 60/053,375, filed Jul. 22, 1997, entitled “Computer Analysis of Essay Content for Automated Score Prediction,” the contents of which are hereby incorporated by reference in their entirety.
FIELD OF THE INVENTION
This invention generally relates to the field of computer-based test scoring systems, and more particularly, to automatic essay scoring systems.
BACKGROUND OF THE INVENTION
For many years, standardized tests have been administered to examinees for various reasons such as for educational testing or for evaluating particular skills. For instance, academic skills tests, e.g. SATs, LSATs, GMATs, etc. are typically administered to a large number of students. Results of these tests are used by colleges, universities and other educational institutions as a factor in determining whether an examinee should be admitted to study at that particular institution. Other standardized testing is carried out to determine whether or not an individual has attained a specified level of knowledge, or mastery, of a given subject. Such testing is referred to as mastery testing, e.g. achievement tests offered to students in a variety of subjects, and the results are used for college credit in such subjects.
Many of these standardized tests have essay sections. These essay portions of an exam typically require human graders to read the wholly unique essay answers. As one might expect, essay grading requires a significant number of work-hours, especially compared to machine-graded multiple choice questions. Essay questions, however, often provide a more well-rounded assessment of a particular test taker's abilities. It is, therefore, desirable to provide a computer-based automatic scoring system.
Typically, graders grade essays based on scoring rubrics, i.e. descriptions of essay quality or writing competency at each score level. For example, the scoring guide for a scoring range from 0 to 6 specifically states that a “6” essay “develops ideas cogently, organizes them logically, and connects them with clear transitions.” A human grader simply tries to evaluate the essay based on descriptions in the scoring rubric. This technique, however, is subjective and can lead to inconsistent results. It is, therefore, desirable to provide an automatic scoring system that is accurate, reliable and yields consistent results.
Literature in the field of discourse analysis points out that lexical (word) and structural (syntactic) features of discourse can be identified (Mann, William C. and Sandra A. Thompson (1988): Rhetorical Structure Theory: Toward a functional theory of text organization, Text 8(3), 243-281) and represented in a machine, for computer-based analysis (Cohen, Robin: A computational theory of the function of clue words in argument understanding, in “Proceedings of 1984 International Computational Linguistics Conference.” California, 251-255 (1984); Hovy, Eduard, Julia Lavid, Elisabeth Maier, Vibhu Mittal and Cecile Paris: Employing Knowledge Resources in a New Text Planner Architecture, in “Aspects of Automated NL Generation,” Dale, Hovy, Rosner and Stock (Eds), Springer-Verlag Lecture Notes in AI no. 587, 57-72 (1992); Hirschberg, Julia and Diane Litman: Empirical Studies on the Disambiguation of Cue Phrases, in “Computational Linguistics” 19(3), 501-530 (1993); and Vander Linden, Keith and James H. Martin: Expressing Rhetorical Relations in Instructional Text: A Case Study in Purpose Relation, in “Computational Linguistics” 21(1), 29-57 (1995)).
Previous work in automated essay scoring, such as by Page, E. B. and N. Petersen: The computer moves into essay grading: updating the ancient test. Phi Delta Kappan; March, 561-565 (1995), reports that predicting essay scores using surface feature variables, e.g. the fourth root of the length of an essay, shows correlations as high as 0.78 between a single human rater (grader) score and machine-based scores for a set of PRAXIS essays. Using grammar checker variables in addition to word counts based on essay length yields up to 99% agreement, where agreement means that the machine-based score matches the human rater score within 1 point on a 6-point holistic rubric. These results have added value because grammar checker variables may carry substantive information about writing competency that reflects rubric criteria such as “essay is free from errors in mechanics, usage and sentence structure.”
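The two quantities mentioned above, a fourth-root length feature and agreement within 1 point, can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and measuring essay length in whitespace-separated words is an assumption, since the text does not say how length is measured.

```python
def fourth_root_length(essay_text):
    """Surface feature: fourth root of essay length (length in words is assumed)."""
    return len(essay_text.split()) ** 0.25


def agreement_rates(human_scores, machine_scores):
    """Return (exact, adjacent) agreement between two lists of integer scores.

    Adjacent agreement counts a pair as matching when the two scores
    differ by at most 1 point.
    """
    if not human_scores or len(human_scores) != len(machine_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    pairs = list(zip(human_scores, machine_scores))
    exact = sum(h == m for h, m in pairs) / len(pairs)
    adjacent = sum(abs(h - m) <= 1 for h, m in pairs) / len(pairs)
    return exact, adjacent
```

Adjacent agreement is always at least as high as exact agreement, which is one reason within-1-point figures on a 6-point scale can look very high.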
SUMMARY OF THE INVENTION
A method of grading an essay using an automated essay scoring system is provided. The method comprises the steps of (a) parsing the essay to produce parsed text, wherein the parsed text is a syntactic representation of the essay, (b) using the parsed text and discourse-based heuristics to create a vector of syntactic features derived from the essay, (c) using the parsed text to create a vector of rhetorical features derived from the essay, (d) creating a first score feature derived from the essay, (e) creating a second score feature derived from the essay, and (f) processing the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature to generate a score for the essay.
In a preferred embodiment, the method further comprises the step of (g) creating a predictive feature set for the test question, where the predictive feature set represents a model feature set for the test question covering a complete range of scores of a scoring guide for the test question, wherein in step (f), a scoring formula may be derived from the predictive feature set and the score for the essay may be assigned based on the scoring guide. Preferably, a batch of original essays, which are essays of a known score to a test question, are used in accordance with the model feature of the invention to create the predictive feature set. Creating the predictive feature set in this manner comprises the steps of repeating steps (a) through (f) for the batch of original essays and processing the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature for each original essay using a linear regression to generate the predictive feature set for the test question.
Preferably, each essay is already in the form of electronic essay text as in the case with on-line essay testing. If this is not the case, however, then the method of the present invention further comprises the step of converting the essay into the form of electronic essay text.
A computer-based automated essay scoring system for grading an essay also is provided. The essay scoring system comprises a Syntactic Feature Analysis program which creates a vector of syntactic features of the electronic essay text, a Rhetorical Feature Analysis program which creates a vector of rhetorical features of the electronic essay text, an EssayContent program which creates a first Essay Score Feature, an ArgContent program which creates a second Essay Score Feature, and a score generator which generates a final score for the essay from the vector of syntactic features, the vector of rhetorical features, the first score feature, and the second score feature.
In a preferred embodiment, the essay scoring system further comprises a parser for producing a syntactic representation of each essay for use by the Syntactic Feature Analysis program and the Rhetorical Feature Analysis program. In another preferred embodiment, the essay scoring system further comprises a Stepwise Linear Regression program which generates a predictive feature set representing a model feature set that is predictive of a range of scores for the test question, and which is provided to the scoring engine for use in assessing the final score for the essay.
BRIEF DESCRIPTION OF THE DRAWING
The present invention will be better understood, and its numerous objects and advantages will become more apparent, by reference to the following detailed description of the invention when taken in conjunction with the following drawing, of which:
FIG. 1 is a functional flow diagram for a preferred embodiment of the e-rater system of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
A computer-based system designed to automatically score essay responses is described herein. Solely for illustrative purposes, the following description of the invention focuses on the standardized GMAT Analytical Writing Assessments: (a) Analysis of an Argument (Argument essays) and (b) Analysis of an Issue (Issue essays) item types, examples of which are shown in Appendix A1 and Appendix A2, respectively. The system of the present invention, named e-rater as in Electronic Essay Rater, automatically analyzes several features of an essay and scores the essay based on the features of writing as specified in holistic scoring rubrics (descriptions of essay quality or writing competency at each score level of a 6-point scoring guide used by several standardized exams such as the GMAT, with 6 being the best score).
The present system automatically rates essays using features that reflect the 6-point holistic rubrics used by human raters to assign scores to essay responses. E-rater is completely automated so that it can be quickly moved into an operationally-ready mode and uses rubric-based features to evaluate essay responses, such as rhetorical structure, vocabulary and syntactic features.
E-rater uses a hybrid feature methodology. It incorporates several variables that are derived statistically, or extracted through Natural Language Processing (NLP) techniques. As described in this specification, e-rater uses four sets of critical feature variables, referred to as predictor variables, to build the final linear regression model used for predicting scores. All predictor variables and counts of predictor variables are automatically generated by several independent computer programs. For argument and issue essay types, all relevant information about the variables is introduced into a stepwise linear regression in order to evaluate the predictive variables, i.e. the variables that account for most of the variation between essays at different score intervals. Variables included in e-rater's final score prediction model for argument and issue essays are: (a) structural features, (b) rhetorical structure analyses, (c) content vector analyses, and (d) content vector analyses by argument (argument vector analyses). A conceptual rationale and a description of how each variable is generated are given below.
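To make the regression step concrete, the following Python sketch fits a linear model from feature rows to known essay scores and then predicts a score for a new feature row. It is a simplified illustration, not e-rater's procedure: it uses ordinary least squares over all supplied features rather than the stepwise selection described above, the function names `fit_linear` and `predict` are hypothetical, and rounding and clipping the prediction to the 0 to 6 range is an assumption.

```python
def fit_linear(features, scores):
    """Ordinary least squares via the normal equations.

    features: list of numeric feature rows, one per essay of known score.
    Returns [intercept, w1, w2, ...].
    """
    rows = [[1.0] + [float(x) for x in row] for row in features]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, scores))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


def predict(coeffs, feature_row, low=0, high=6):
    """Predict a score, rounded and clipped to the scoring range (assumed 0 to 6)."""
    raw = coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature_row))
    return min(high, max(low, round(raw)))
```

A stepwise procedure would repeatedly add or drop features and refit, keeping only those that account for most of the score variation; a fit like this one would be the inner step of such a loop.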
A. Structural Features
The scoring guides for both argument and issue essays indicate that one feature used to rate an essay is “syntactic variety.” Syntactic structures in essays can be identified using NLP techniques. In the present invention, all sentences in the essay responses are parsed. The parser takes a sentence string as input and returns a syntactically analyzed version of the sentence, as illustrated in Table 1. Examination of syntactic structures in an essay response yields information about the “syntactic variety” in the essay. For example, the types of clauses or verbs that occur can reveal information about “syntactic variety.” In Table 1, DEC is a declarative sentence, NP is a noun phrase, AJP is an adjective phrase, ADJ is an adjective, NOUN is a noun, PP is a prepositional phrase, PREP is a preposition, INFCL is an infinitive clause, DETP is a determiner phrase, and CHAR is a character.
A program for examining syntactic structure was run on approximately 1,300 essays. The program counted the number of complement clauses, subordinate clauses, infinitive clauses, relative clauses and subjunctive modal auxiliary verbs such as would, could, should, might and may, for each sentence in an essay. A linear regression analysis then selected the variables in Table 2 as predictive variables for the final score prediction model. By using these predictive variables, a vector of syntactic counts (42 in FIG. 1) is generated for each essay and is used by e-rater in the final scoring.
Table 2. Grammatical Structural Variables Used in e-rater to Predict Essay Scores
Total Number of Modal Auxiliary Verbs
Ratio of Complement Clauses Per Sentence
Total Number of Infinitive Clauses
Total Number of Modal Auxiliary Verbs/Paragraph
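As a rough illustration of two of the counts listed in Table 2, the following Python sketch tallies modal auxiliary verbs per essay and per paragraph. It is a simplification under stated assumptions: the program described above works on parser output, whereas this sketch matches tokens only, and the function name `modal_counts` and the blank-line paragraph convention are hypothetical.

```python
import re

# Modal auxiliaries named in the description above.
MODALS = {"would", "could", "should", "might", "may"}

def modal_counts(essay: str) -> dict:
    """Count modal auxiliaries by token match only (no syntactic parsing).

    Caveat: a token match cannot tell the modal "may" from the month "May".
    """
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", essay) if p.strip()]
    total = 0
    for paragraph in paragraphs:
        tokens = re.findall(r"[a-z']+", paragraph.lower())
        total += sum(1 for token in tokens if token in MODALS)
    per_paragraph = total / len(paragraphs) if paragraphs else 0.0
    return {"total_modals": total, "modals_per_paragraph": per_paragraph}

essay = "Sales might rise. They could also fall.\n\nThe company should test this."
print(modal_counts(essay))
```

A parser-based version would count only tokens tagged as auxiliaries, which is why the counts here should be read as approximate.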
B. Rhetorical Structure Analysis
In both argument and issue essays, the scoring guides indicate that an essay will receive a score based on the examinee's demonstration of a well-developed essay. For the argument essay, the scoring guide states specifically that a “6” essay “develops ideas cogently, organizes them logically, and connects them with clear transitions.” For the issue essay, a “6” essay “develops a position on the issue with insightful reasons ...” and the essay “is clearly well-organized.”
Language in holistic scoring guides, such as “cogent,” “logical,” “insightful,” and “well-organized,” has “fuzzy” meaning because it is based on imprecise observation. Methods of “fuzzy logic” can be used to automatically assign these kinds of “fuzzy” classifications to essays. This part of the present invention identifies the organization of an essay through automated analysis of the rhetorical (argument) structure of the essay.
The linguistic literature about rhetorical structure (Cohen (1984), Hovy et al. (1992), Hirschberg and Litman (1993), and Vander Linden and Martin (1995)) points out that rhetorical (or discourse) structure can be characterized by words, terms and syntactic structures. For instance, words and terms that provide “clues” about where a new argument starts, or how it is being developed, are discussed in the literature as “clue words.”
Conjunctive relations from Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik: A Comprehensive Grammar of the English Language, Longman, New York (1985), including terms such as “In summary” and “In conclusion,” are considered to be clue words and are classified as conjuncts used for summarizing. Clue words such as “perhaps” and “possibly” are considered to be “belief” words used by a writer to express a belief in developing an argument in the essay. Words like “this” and “these” may often be used to flag that the writer has not changed topics (Sidner, Candace: 1986, Focusing in the Comprehension of Definite Anaphora, in “Readings in Natural Language Processing,” Barbara Grosz, Karen Sparck Jones, and Bonnie Lynn Webber (Eds.), Morgan Kaufmann Publishers, Los Altos, Calif. 363-394). It was also observed that in certain discourse contexts, structures such as infinitive clauses (INFCL) mark the beginning of a new argument, e.g. “To experimentally support their argument, Big Boards (INFCL) would have to do two things.”
One part of the present invention is an automated argument partitioning and annotation program (APA). APA outputs a file for each essay after it is partitioned into argument units. In addition, APA outputs a second file in which each sentence in an essay is annotated with word, term or structure classifications that denote argument structure. A specialized dictionary (lexicon) is used by APA to identify relevant clue words and terms. The lexicon used by e-rater is displayed in Appendix B1.
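The lexicon lookup can be pictured with a minimal Python sketch. The four entries come from the clue words discussed above; the label strings `summary` and `belief`, the name `CLUE_LEXICON`, and the function `annotate_sentence` are hypothetical stand-ins, since the actual lexicon is the one in Appendix B1. The sketch also ignores syntactic context, which APA uses when deciding whether a clue word counts.

```python
import re

# Hypothetical mini-lexicon; labels are illustrative, not the Appendix B1 labels.
CLUE_LEXICON = {
    "in summary": "summary",
    "in conclusion": "summary",
    "perhaps": "belief",
    "possibly": "belief",
}

def annotate_sentence(sentence: str) -> list:
    """Return (clue, label) pairs for lexicon entries found in the sentence."""
    # Lowercase and collapse punctuation/whitespace so multi-word clues match
    # on word boundaries.
    padded = " " + re.sub(r"[^a-z]+", " ", sentence.lower()) + " "
    found = []
    for clue, label in CLUE_LEXICON.items():
        if f" {clue} " in padded:
            found.append((clue, label))
    return found

print(annotate_sentence("In conclusion, the plan is perhaps too risky."))
```

Writing one such annotation list per sentence gives a crude analogue of the second output file that APA produces.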
APA's heuristics select the dictionary-based clue words, terms, and non-lexical structures. Descriptions of the rules used by APA appear in Appendix B2. The rules that APA uses to partition and annotate arguments specify syntactic structure and the syntactic contexts in which clue words contribute to argument structure. APA uses parsed essays to identify syntactic structures in essays. Essays have been syntactically parsed and each sentence in the essay has a syntactic analysis. Table 3 illustrates original essay text and the text output by APA with corresponding argument partitioning annotations, where wording in the argument-unit annotations has been revised for comprehensibility.
C. Content Vector Analysis
The scoring rubric suggests that certain ideas are expected in an essay by stating that the essay “effectively supports the main points of the critique” for argument essays and “explores ideas and develops a position on the issue with insightful reasons and/or persuasive examples” for the issue essays. Content vector (CV) analysis is a statistical weighting technique used to identify relationships between words and documents. With regard to the approximate specifications in the rubric about essay content, CV analysis can be used to identify vocabulary (or content words) in essays that appear to contribute to essay score.
Assigning one of six scores to a GMAT essay is a standard type of classification problem. Statistical approaches to classification define each class (score) by the distribution of characteristics found in labeled training examples. Then, each test essay is analyzed, and its distribution is compared to that of the known classes. The class which best matches the test essay is selected.
For text, the characteristics may be physical (the number or length of words, sentences, paragraphs, or documents), lexical (the particular words that occur), syntactic (the form, complexity, or variety of constructions), rhetorical (the number or type of arguments), logical (the propositional structure of the sentences), or a combination of these.
Standard CV analysis characterizes each text document (essay) at the lexical (word) level. The document is transformed into a list of word-frequency pairs, where frequency is simply the number of times that the word appeared in the document. This list constitutes a vector which represents the lexical content of the document. Morphological analysis can optionally be used to combine the counts of inflectionally-related forms so that “walks,” “walked,” and “walking” all contribute to the frequency of their stem, “walk.” In this way, a degree of generalization is realized across morphological variants. To represent a whole class of documents, such as a score level for a set of essays, the documents in the class are concatenated and a single vector is generated to represent the class.
CV analysis refines this basic approach by assigning a weight to each word in the vector based on the word's salience. Salience is determined by the relative frequency of the word in the document (or class) and by the inverse of its frequency over all documents. For example, “the” may be very frequent in a given document, but its salience will be low because it appears in all documents. If the word “pterodactyl” appears even a few times in a document, it will likely have high salience because there are relatively few documents that contain this word.
A test essay is compared to a class by computing a cosine correlation between their weighted vectors. The cosine value is determined by the following equation:
$$\cos(a, b) = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}}$$
where $a_i$ is the frequency of word “i” in document “a” and $b_i$ is the frequency of word “i” in document “b.” The larger the value of the correlation, the closer the test essay is to the class. The class which is closest to the test essay is selected and designated “Essay Score Feature A” (22 in FIG. 1). These steps are summarized below.
Vector construction for each document (or class):
Extract words from document (or combined documents)
Apply morphological analysis (optional)
Construct frequency vector
Assign weights to words to form weighted vector
Compute cosine correlation between test essay vector and the vector of each class
Select class with highest correlation
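The steps above can be sketched in Python as follows. This is an illustration under assumptions, not the EssayContent program itself: the weighting used here (relative frequency times the log of inverse document frequency) is one common reading of “salience,” morphological analysis is skipped, and all function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter

def frequency_vector(text):
    """Word-frequency vector for a document (no morphological analysis)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def weighted_vector(freqs, doc_freq, n_docs):
    """Assumed weighting: relative frequency times log inverse document frequency."""
    total = sum(freqs.values())
    return {
        word: (count / total) * math.log(n_docs / doc_freq[word])
        for word, count in freqs.items()
        if word in doc_freq  # words unseen in the classes carry no weight
    }

def cosine(a, b):
    """Cosine correlation between two sparse weighted vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def predict_class(test_text, class_texts):
    """Pick the class (score) whose weighted vector is closest to the test essay."""
    class_freqs = {label: frequency_vector(text) for label, text in class_texts.items()}
    # Document frequency: number of classes in which each word occurs.
    doc_freq = Counter(word for freqs in class_freqs.values() for word in freqs)
    n_docs = len(class_freqs)
    class_vectors = {label: weighted_vector(f, doc_freq, n_docs) for label, f in class_freqs.items()}
    test_vector = weighted_vector(frequency_vector(test_text), doc_freq, n_docs)
    return max(class_vectors, key=lambda label: cosine(test_vector, class_vectors[label]))

# Each class text stands in for the concatenated training essays at that score.
classes = {
    6: "the billboard argument ignores alternative causes and lacks evidence",
    2: "the billboards are good and the sales are good",
}
print(predict_class("the argument lacks evidence", classes))
```

Words such as “the” that occur in every class receive a weight of zero under this weighting, which mirrors the salience example given earlier.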
As discussed in the next section, CV analysis can also be applied to units of text smaller than essays. For example, it can be used to evaluate the rhetorical arguments within an essay. In this case, each argument is treated like a mini-document and is compared to the classes independently of the other arguments. The result is a vector of classes (scores), one for each argument in the essay.
E-rater uses a CV analysis computer program which automatically predicts scores for both argument and issue essays. The scores assigned by the CV analysis program are used as a predictor variable for the set of argument essays.
D. Argument-Content Vector Analysis
An important goal of this invention is to be able to predict essay scores based on “what the writer says.” CV analysis, as it is used above, identifies word associations over the essay as a whole. It looks at words randomly in the essay. Although this tells the reader something about possible essay content, it is important to capture words in a more structured way, so that topic may be identified using closely clustered word groupings.
The scoring rubric specifies that relevant essay content (or relevant words used in an essay) should be well organized and should address relevant content. Therefore, a revised version of the content vector analysis program was implemented and run on the “argument partitioned” training essays for argument and issue essays.
Another content similarity measure, ArgContent, is computed separately for each argument in the test essay and is based on the kind of term weighting used in information retrieval. For this purpose, the word frequency vectors for the six score categories, described above, are converted to vectors of word weights. The weight for word “i” in score category “s” is:
$$w_{i,s} = \frac{\mathrm{freq}_{i,s}}{\mathrm{max\_freq}_{s}} \cdot \log\left(\frac{\mathrm{n\_essays}_{\mathrm{total}}}{\mathrm{n\_essays}_{i}}\right)$$
where $\mathrm{freq}_{i,s}$ is the frequency of word “i” in category “s,” $\mathrm{max\_freq}_{s}$ is the frequency of the most frequent word in category “s” (after a stop list of words has been removed), $\mathrm{n\_essays}_{\mathrm{total}}$ is the total number of training essays across all six categories, and $\mathrm{n\_essays}_{i}$ is the number of training essays containing word “i.”
The first part of the weight formula represents the prominence of word “i” in the score category, and the second part is the log of the word's inverse document frequency (IDF). For each argument “a” in the test essay, a vector of word weights is also constructed. The weight for word “i” in argument “a” is:
$$w_{i,a} = \frac{\mathrm{freq}_{i,a}}{\mathrm{max\_freq}_{a}} \cdot \log\left(\frac{\mathrm{n\_essays}_{\mathrm{total}}}{\mathrm{n\_essays}_{i}}\right)$$
where $\mathrm{freq}_{i,a}$ is the frequency of word “i” in argument “a,” and $\mathrm{max\_freq}_{a}$ is the frequency of the most frequent word in “a” (once again, after a stop list of words has been removed). Each argument (as it has been partitioned) is evaluated by computing cosine correlations between its weighted vector and those of the six score categories, and the most similar category is assigned to the argument. As a result of this analysis, e-rater has a set of scores (one per argument) for each test essay. The final score is then calculated as an adjusted mean of the set of scores, represented as ArgContent:
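$$\mathrm{ArgContent} = \frac{\left(\sum \mathrm{arg\_scores}\right) + \mathrm{n\_args}}{\mathrm{n\_args} + 1}$$
where arg_scores are the scores assigned to the individual arguments and n_args is the number of arguments in the essay. This final score output is designated “Essay Score Feature B” (62 in FIG. 1).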
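A short Python sketch of the two calculations in this section follows. It assumes that the weight is the word's frequency relative to the most frequent word, multiplied by the log of the inverse document frequency, and that the adjusted mean is the sum of the per-argument scores plus the number of arguments, divided by the number of arguments plus one. The function names and sample numbers are hypothetical.

```python
import math

def category_weights(freqs, n_essays_total, n_essays_with_word):
    """Weight per word: (freq / max_freq) * log(n_essays_total / n_essays_i).

    freqs should already have stop-list words removed.
    """
    max_freq = max(freqs.values())
    return {
        word: (freq / max_freq) * math.log(n_essays_total / n_essays_with_word[word])
        for word, freq in freqs.items()
    }

def arg_content(arg_scores):
    """Adjusted mean of the per-argument scores."""
    n_args = len(arg_scores)
    return (sum(arg_scores) + n_args) / (n_args + 1)

# Invented counts: "sales" occurs in 5 of 10 training essays, "billboard" in all 10.
weights = category_weights(
    {"sales": 4, "billboard": 2},
    n_essays_total=10,
    n_essays_with_word={"sales": 5, "billboard": 10},
)
print(weights)                 # "billboard" gets weight 0.0 because log(10/10) = 0
print(arg_content([4, 5, 6]))  # (15 + 3) / 4 = 4.5
```

The same weighting function can be applied to a single argument's frequencies to obtain the argument vector that is compared against the six category vectors.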
E. The e-rater System Overview
FIG. 1 shows a functional flow diagram for a preferred embodiment of the e-rater system of the present invention. The first step in automatically scoring an essay is creating a model feature set, i.e., the set of features used to predict scores at each score point of the scoring rubric. The system starts with a batch of approximately 250-300 original electronic essay text responses (essays already having a known score). Each original electronic essay text 10 is evaluated by EssayContent 20 to perform Content Vector Analysis (as described in Section C above) and to generate “Essay Score Feature A,” and is also parsed by the parser 30 to produce a “syntactic” representation of each essay response, denoted as parsed essay text 32.
Syntactic Feature Analysis 40 (program clause.c) then processes the parsed essay text 32 to extract syntactic information (as described above in Section A entitled “Structural Features”) and creates a vector of syntactic feature counts 42 for each syntactic feature considered by e-rater. Rhetorical Feature Analysis 50 (program gmat.c) also processes the parsed essay text 32 (as described above in Section B entitled “Rhetorical Structure Analysis”) to generate annotated text 52, which includes a vector of rhetorical feature counts 54 and text partitioned into independent arguments 56. This argument partitioned text 56 is then evaluated by ArgContent to perform Argument-Content Vector Analysis (Section D above) to produce “Essay Score Feature B” 62.
The vector of syntactic features 42, the vector of rhetorical features 54, Essay Score Feature A 22, and Essay Score Feature B 62 are then fed (depicted by the phantom arrows) into a stepwise linear regression 70, from which a “weighted” predictive feature set 72 is generated for each test question using the batch of sample data. The set of weighted predictive features defines the model feature set for each test question.
The steps described above, up to the linear regression 70, are then performed for each actual essay response whose score is to be predicted. The vector of syntactic features 42, the vector of rhetorical features 54, Essay Score Feature A 22, and Essay Score Feature B 62 for each response are then fed (depicted by the solid arrows) into the score calculation program 80 associated with the model answer for the test question with which the essay is associated, and a Final Score 90 between 0 and 6 is generated.
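The final step can be pictured as applying a fitted linear model to one essay's features. The Python sketch below is illustrative only: the feature names, weights and intercept are invented, and rounding to the nearest integer before clamping to the 0 to 6 range is an assumption, since the text states only that the final score lies between 0 and 6.

```python
def final_score(features, weights, intercept):
    """Apply a linear model to one essay's features and clamp to the 0-6 scale."""
    raw = intercept + sum(
        weights[name] * value
        for name, value in features.items()
        if name in weights  # only features selected by the regression contribute
    )
    # Assumption: round to the nearest integer, then clamp to [0, 6].
    return min(6, max(0, round(raw)))

# Hypothetical weighted predictive feature set for one test question.
weights = {"modal_count": 0.1, "essay_score_a": 0.4, "essay_score_b": 0.4}
features = {"modal_count": 5, "essay_score_a": 5, "essay_score_b": 4.5}
print(final_score(features, weights, intercept=0.5))
```

In the system described here, the weights would come from the stepwise linear regression run on the batch of previously scored essays for the same test question.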
It will be appreciated by those skilled in the art that the foregoing has set forth the presently preferred embodiment of the invention and an illustrative embodiment of the invention, but that numerous alternative embodiments are possible without departing from the novel teachings of the invention. All such modifications are intended to be included within the scope of the appended claims.
APPENDIX A1: ANALYSIS OF AN ARGUMENT ITEM
Time: 30 Minutes
Directions: In this section you will be asked to write a critique of the argument presented below. You are not being asked to present your own views on the subject.
Read the argument and the instructions that follow it, and then make any notes in your test booklet that will help you plan your response. Begin writing your response on the separate answer document. Make sure that you use the answer document that goes with this writing task.
The following is from a campaign by Big Boards, Inc. to convince companies in River City that their sales will increase if they use Big Boards billboards for advertising their locally manufactured products.
Appendix B2: Rules Used By e-rater
A. Extracts “after” and “afterwards” if they occur sentence-initially as a conjunction.
A. Constrains argument extraction for “also”, classified in the lexicon as arg-init#Parallel, and for additional adverbs classified as arg_dev#Belief such that all are extracted if they appear in sentence initial position or if they modify the main verb of the sentence (defined as the first verb that occurs in the second column of the parse tree).
III. LEXICALLY-BASED RULE FOR BEGINNING AN ARGUMENT
a. Constrains the extraction of nouns and pronouns classified as arg-init#CLAIM words in the lexicon to main clause subject NPs and, in sentences beginning with “There”, to the position after a form of the verb “to be”.
a. Controls the extraction and labeling of nouns in arg_init position that are modified by “this” or “these”; these nouns are labeled arg_dev#SAME_TOPIC when they occur in the second or later sentence of a paragraph.
b. If “This”, “These” or “It” occurs as a pronoun in the first noun phrase of the parse tree of a sentence that is not paragraph-initial, it is output with the label arg_dev#SAME_TOPIC. This label is generated dynamically. “This”, “these” and “it” are not stored in the lexicon.
A. Extracts “but” if it is labeled as a conjunction.
VI. COMPLEMENT CLAUSE RULE
A. Extracts complement clauses introduced by “that” as well as complement clauses that do not begin with “that.”
B. Labels complement clause as arg_init#CLAIM_THAT* when it is the first or only sentence of a paragraph, otherwise it is labeled as arg_dev#CLAIM_THAT*
C. Extracts the conjunction “that” if it occurs in a complement clause, or a complement clause not introduced by “that” under the following conditions:
1. the complement clause is not embedded in another COMPCL or SUBCL
2. the complement clause is not further embedded than the third column of the parse tree
VII. SUBORDINATE CLAUSE RULE FOR BEGINNING AN ARGUMENT
A. If the very first sentence of a paragraph begins with a subordinate clause, extract the noun or pronoun from the main clause NP and consider it to be the beginning of a new argument. The noun or pronoun extracted is labeled arg_init#D-SPECIFIC if it is not listed in the lexicon.
VIII. “FIRST” RULE
A. Constrains words listed in the lexicon that are classified as arg_init#Parallel words.
B. All words of this category in sentence-initial position are extracted (cf ALSO RULE).
C. If the word is not sentence-initial, one of the following conditions must be satisfied.
1. It must be in the first constituent of the parse tree, provided that the first constituent is not a subordinate clause and that it is not further embedded in the parse tree than the third column.
2. It must be the first NP following a sentence-initial adverb.
3. If the first constituent is the pronoun “I” followed by a verb, then the “FIRST” item must immediately follow the verb.
IX. “FURTHER” RULE
A. Extracts “further”, “overall” or “altogether” if they occur sentence-initially and do not modify another constituent.
X. INFINITIVE CLAUSE RULE
A. Extracts an infinitival clause that is not further embedded than the third column of the parse tree and either follows or precedes the main verb of the sentence. The clause must not be embedded in a subordinate clause or a complement clause. An extracted infinitival clause is labeled arg_init#To-INFL if it is in the first or only sentence of a paragraph, and arg_dev#To_INFL otherwise.
XI. RULE FOR BEGINNING AN ARGUMENT AT A NEW PARAGRAPH
A. If a paragraph has no lexical or structural argument initializations then a label arg_init#NEW_PARAGRAPH is applied.
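The fallback in rule XI can be illustrated with a small Python sketch. It assumes that earlier rules have already produced a list of labels for each paragraph and that argument initializations can be recognized by the prefix `arg_init#`; the function name `apply_new_paragraph_rule` and the input shape are hypothetical.

```python
def apply_new_paragraph_rule(paragraph_labels):
    """Add arg_init#NEW_PARAGRAPH to paragraphs lacking any argument initialization.

    paragraph_labels: one list of annotation labels per paragraph, in order.
    """
    result = []
    for labels in paragraph_labels:
        # Assumption: initialization labels all start with the prefix "arg_init#".
        if any(label.startswith("arg_init#") for label in labels):
            result.append(list(labels))
        else:
            result.append(["arg_init#NEW_PARAGRAPH"] + list(labels))
    return result

labels = [
    ["arg_init#CLAIM_THAT*", "arg_dev#Belief"],  # already starts an argument
    ["arg_dev#SAME_TOPIC"],                      # development only
    [],                                          # nothing extracted
]
for paragraph in apply_new_paragraph_rule(labels):
    print(paragraph)
```

Under this reading, every paragraph ends up with at least one argument-initial label, so each paragraph boundary also acts as an argument boundary.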
A. Extracts the conjunctions “or” and “either” when they occur in the second column of the parse tree, and the node immediately following the conjunction is a verb phrase.
XIII. PARALLEL TERM RULE
A. Prevents the extraction of arg_init#Parallel lexical entries if they modify a verb or a noun at any level of embedding (cf also FIRST.DOC)
XIV. “SHOULD” RULE
A. The words “would”, “should”, “might”, “may”, and “could” are picked up for each essay. They are classified as arg_aux#SPECULATE in the lexicon.
B. These words occur in parse trees in the structure
A. Extracts the conjunction “so” if it occurs initially in a subordinate clause or if it is a sentence-initial adverb.
A. Extracts “then” if it occurs as an adverb or a conjunction that is not further embedded than the second column of the parse tree.
XVII. VERBING RULE
A. Extracts sentence-initial nouns and verbs ending in “-ing”, as well as “-ing” verbs that immediately follow a prepositional phrase or an adverb that is in the second column of a parse tree. These extracted “-ing” words are labeled as arg_init#CLAIM_Ving if in the first or only sentence of a paragraph, and arg_dev#CLAIM_Ving otherwise.
B. If the base form of the verb is “do”, then the label will be arg_dev#Inference.
A. Extracts all occurrences of “when” in the following structure
I. ABBCL* CONJUNCTION PHRASE* CONJUNCTION* “when” if this structure occurs no further embedded than the fourth column of the parse.
A. Extracts “while” under the following conditions.
1. It is the first constituent of a sentence
2. It is a conjunction in a subordinate clause that is not further embedded than the third column.