

Combining a Discourse Analysis Tool With FEP
A first definition rubric with a preregistered test plan
So, those of you who have been following my writing have seen my effort to reveal the truth about what Real Science is and what is just the false faith of the global institutions. Well, now I have another tool. As some of you know, I have been looking for Real Science. To that end, I created the Design Biology Forensic Evaluation Protocol and then refined it into the Forensic Evaluation Protocol to move beyond biology. So, here we are. I am now combining FEP with a Discourse Analytical Tool. You see, peer review often succeeds or fails on language. Reviewers read the data and the methods, but they also weigh the claims, the tone, the certainty words, and how an author handles pushback.
This article turns that “language sense” into a repeatable tool. It adds a discourse layer to FEP, making weak reasoning easier to spot and fix.
The big idea
A research paper has two layers.
The first layer is the evidence layer. That is data, methods, and analysis.
The second layer is the discourse layer. That is wording, claim size, certainty language, and argument structure.
Reviewers already look at both layers. They handle the discourse layer loosely. This tool makes that part clear, scored, and testable.
What “discourse analysis” means here
Discourse analysis studies language in context, not just sentence-by-sentence grammar. It asks questions like these.
Does the writer define key terms?
Do those terms keep the same meaning later?
Do claims stay inside what the evidence can carry?
Does the author name what would prove them wrong?
This tool focuses on patterns that often show up when a paper is weak, slippery, or more persuasive than testable.
Definitions we will use
Science
In this tool, “science” refers to writing and reporting that do five things.
- States claims with clear limits.
- Links claims to real observable evidence.
- Describes methods so others can check or repeat them.
- Names what would count against the claim.
- Updates beliefs when evidence changes.
This is a practical definition. It focuses on behavior in text and reporting.
Weak science
Weak science remains within a real scientific area but has avoidable quality issues.
It may have unclear terms, missing controls, weak measurement reporting, or inflated conclusions.
Weak science can be honest work that needs revision.
Pseudoscience (risk pattern)
This tool does not label a paper “pseudoscience” by text alone.
It flags risk when the writing shows repeated patterns that block correction, like these.
It avoids clear ways to be proven wrong.
It uses “immunizing” moves to dodge counterevidence.
It cites one side and hides strong contradictions.
It replaces testing with persuasion tactics.
It treats disagreement as evil or a conspiracy without evidence.
Where it fits inside FEP
Use it as a plug-in, not a replacement.
Place it right after the Claim Map to catch claim inflation early.
Place it again after the Evidence Ledger to compare language strength to evidence strength.
Think of it as an early warning system. It helps you decide where to dig deeper.
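To make the placement concrete, here is a minimal sketch of the review flow as an ordered list in Python. Only the Claim Map and the Evidence Ledger come from FEP as described above; the other step names are placeholders I am using for illustration.

```python
# Sketch of where the discourse check plugs into an FEP-style review.
# Only "claim_map" and "evidence_ledger" are named above; the rest are placeholders.
FEP_REVIEW_FLOW = [
    "claim_map",
    "discourse_check_first_pass",    # catch claim inflation early
    "evidence_ledger",
    "discourse_check_second_pass",   # compare language strength to evidence strength
    "method_and_statistics_review",  # the deeper review the warning system points toward
]
```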
What the tool does, and does not do
It does
It flags discourse patterns linked to weak reasoning.
It separates “unclear writing” from “bad argument.”
It produces a score with decision rules.
It points to the exact passages that drove the score.
It does not
It does not prove fraud.
It does not guess motives.
It does not replace method reviews, statistical checks, or replication.
How the scoring works
Unit of analysis
Coders score each major section of a manuscript.
Abstract
Introduction
Methods
Results
Discussion
Conclusion
Then they total the scores.
Inputs allowed
The main score uses text only.
A separate optional appendix can include metadata such as conflicts of interest or venue standards.
Text score and metadata score stay separate.
Output
The tool produces three things.
A total Discourse Integrity Score (DIS)
Subscores for specific failure types
A short note that quotes the highest-risk lines
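To picture that output, here is a minimal sketch of the report record in Python. The class and field names are my own placeholders, not part of the protocol.

```python
from dataclasses import dataclass, field

@dataclass
class DISReport:
    """Illustrative container for the three outputs; names are placeholders."""
    total: int                                                  # total Discourse Integrity Score (DIS)
    subscores: dict[str, int] = field(default_factory=dict)     # failure type -> 0-3 score
    flagged_passages: list[str] = field(default_factory=list)   # quoted highest-risk lines
```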
The DIS rubric (0 to 3 per item)
Each item gets a 0–3 score.
0 means “not present.”
3 means “dominant or severe.”
Here are the 10 criteria in plain language.
1) Definition stability
Do key terms keep the same meaning across the paper?
2) Claim scope control
Do claims stay inside what the evidence supports?
3) Falsifiability language
Does the paper say what would count against the claim?
4) Argument validity
Does the paper lean on common reasoning errors?
Examples include straw man arguments, circular logic, false dilemmas, non sequiturs, and the use of authority as a substitute for evidence.
5) Counterevidence handling
Does the paper treat opposing evidence fairly, or dismiss it by ridicule and motives?
6) Rhetoric-to-evidence match
Does the intensity of the language fit the strength of the support?
7) Method-reporting clarity (text layer only)
Can a reader tell what was done, based on the writing?
8) Integrity of causal language
Does the paper claim “causes” when it only shows correlation or offers speculation?
9) Misuse of jargon
Does technical language hide weak meaning?
10) Social-power framing as evidence
If the paper uses power or identity language, does it treat that language as evidence?
How to read the score
0–8 means low discourse risk.
9–18 means moderate discourse risk.
19–30 means high discourse risk.
A high score does not prove pseudoscience.
It means, “This paper needs a deeper review first.”
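Here is a minimal sketch of the arithmetic, assuming one 0–3 score per criterion at the paper level. The criterion keys and function names are placeholders; the bands match the ranges above.

```python
CRITERIA = [
    "definition_stability", "claim_scope_control", "falsifiability_language",
    "argument_validity", "counterevidence_handling", "rhetoric_evidence_match",
    "method_reporting_clarity", "causal_language_integrity", "jargon_misuse",
    "social_power_framing",
]

def dis_total(item_scores: dict[str, int]) -> int:
    """Sum the ten 0-3 item scores into a 0-30 Discourse Integrity Score."""
    for name in CRITERIA:
        if not 0 <= item_scores[name] <= 3:
            raise ValueError(f"{name} must be scored 0-3")
    return sum(item_scores[name] for name in CRITERIA)

def risk_band(total: int) -> str:
    """Map a 0-30 total onto the three risk bands."""
    if total <= 8:
        return "low discourse risk"
    if total <= 18:
        return "moderate discourse risk"
    return "high discourse risk"
```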
Coder rules (to keep it fair)
Coders use a shared glossary.
Coders practice on a training set.
Coders must justify each score with a quoted passage.
Coders do not guess intent. They score text features only.
These rules are not cosmetic. They are the guardrails that make this tool credible.
The preregistered validation plan
This tool only matters if it works under testing. So the plan is preregistered, which means the rules are written down before the results are collected.
Research questions
Can trained coders score papers and agree often?
Does DIS separate mainstream papers from known pseudoscience corpora above chance?
Does DIS predict later signs of weakness, like major corrections?
Corpus design
Group A: established journals across fields
Group B: fringe or predatory venues and known pseudoscience outlets
Group C: borderline cases in mixed-quality venues
The corpus will balance field, year, article type, and author language background when possible.
Blinding
Coders will not see author names, institutions, funding, or journal names.
Another team holds metadata for later.
Procedure
Two coders score each paper.
A third coder adjudicates when the two scores differ too much.
Time-to-score is recorded to test feasibility.
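Here is a sketch of that adjudication rule. The disagreement threshold and the averaging rule are placeholders; the real values would be fixed in the preregistration.

```python
def needs_third_coder(total_a: int, total_b: int, max_gap: int = 5) -> bool:
    """Flag a paper for adjudication when the two totals differ too much.
    The 5-point gap is a placeholder, not a preregistered value."""
    return abs(total_a - total_b) > max_gap

def consensus_total(total_a: int, total_b: int, total_c: int | None = None) -> float:
    """Average the available totals: two coders normally, three after adjudication.
    Averaging is itself a placeholder consensus rule."""
    totals = [t for t in (total_a, total_b, total_c) if t is not None]
    return sum(totals) / len(totals)
```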
Reliability targets
The main goal is strong inter-coder agreement on most criteria.
The success target is a prespecified, high level of agreement after training.
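One common way to express that target is a weighted Cohen's kappa on the ordinal 0–3 item scores, as in this sketch. The choice of kappa, the quadratic weighting, and any pass threshold are my assumptions; the plan above only commits to a high level of agreement.

```python
# Assumes scikit-learn is installed.
from sklearn.metrics import cohen_kappa_score

def per_criterion_kappa(coder1: list[int], coder2: list[int]) -> float:
    """Quadratic-weighted kappa for one criterion's 0-3 scores across the training set."""
    return cohen_kappa_score(coder1, coder2, weights="quadratic")

# Toy example: two coders scoring "claim_scope_control" on six papers.
kappa = per_criterion_kappa([0, 2, 3, 1, 2, 0], [0, 3, 3, 1, 1, 0])
```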
Validity tests
DIS vs. group label (A, B, C) using standard classification checks.
For older papers, DIS vs. later outcomes such as retractions, major corrections, and replication results, when available.
Control for field, year, and study design.
Baselines
DIS must beat simple shortcuts like these.
Readability score alone
Length and citation count alone
Standard reporting checklists, when available
That matters because “good writing” is not the same as “good science.”
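As one concrete version of those checks, the sketch below compares the DIS total against a readability-only baseline using ROC AUC for separating Group A from Group B. AUC is my choice here, not a preregistered test, and the numbers are made-up toy values for illustration only.

```python
# Assumes scikit-learn is installed; the values below are toy numbers, not results.
from sklearn.metrics import roc_auc_score

labels      = [0, 0, 0, 1, 1, 1]           # 0 = Group A (established), 1 = Group B (fringe)
dis_totals  = [4, 7, 11, 16, 22, 25]       # Discourse Integrity Scores
readability = [55.0, 48.0, 60.0, 52.0, 47.0, 58.0]  # e.g. a Flesch-style score

auc_dis      = roc_auc_score(labels, dis_totals)   # how well DIS separates the groups
auc_baseline = roc_auc_score(labels, readability)  # the shortcut DIS must beat
print(f"DIS AUC: {auc_dis:.2f}   readability-only AUC: {auc_baseline:.2f}")
```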
Error analysis
Publish top false positives and top false negatives.
Show the failure modes.
Revise the rubric only after the preregistered analysis is complete.
Predictions and falsifiers
Predictions
Trained coders will agree at high rates.
Group B will score higher than Group A.
High DIS will correlate with later correction signals.
DIS will beat writing-quality baselines.
Falsifiers (what would prove the tool failed)
Coders cannot reach a stable agreement after training.
DIS cannot separate groups above chance.
DIS tracks readability or author fluency more than reasoning weakness.
DIS flags controversy and novelty more than poor logic.
DIS fails to predict any later correction or replication signal.
If any falsifier holds, the tool is revised, or its claims are narrowed. For example, it may become “argument clarity screening” instead of “science vs pseudoscience differentiation.”
Limits, risks, and safeguards
Language bias
Non-native English writing can look “messy” even when the science is strong.
We address this by blinding coders and comparing DIS against readability baselines.
Ideology policing
Discourse tools can be abused as political filters.
This rubric avoids that by scoring only features tied to testability, scope, and reasoning.
Overreach
DIS does not define science on its own.
It flags risk and sets review priorities.
Conclusion
Discourse analysis can strengthen peer review when it becomes a repeatable check. A clear rubric can detect definitional drift, claim inflation, unfalsifiable framing, and argument defects.
The value of this tool depends on validation. That is why it includes definitions, scoring rules, a preregistered test plan, and falsifiers.
The goal is not to replace method review. The goal is to add a language-forensics layer that makes weak reasoning harder to hide and easier to correct.
