The Regents of the University of California - [2022] APO 17

IP AUSTRALIA
AUSTRALIAN PATENT OFFICE
The Regents of the University of California [2022] APO 17

Patent Application: 2016203450

Title: BamBam: Parallel Comparative Analysis of High-Throughput Sequencing Data

Patent Applicant: The Regents of the University of California
Delegate: Dr V. Z. Kolev
Decision Date: 11 March 2022
Hearing Date: Written submissions filed on 06 May 2020

Catchwords: PATENTS – sections 45 and 49 – hearing with respect to examiner’s objections – manner of manufacture – lack of unity – inventive step – claim interpretation – clarity – all claims found not clear – examiner’s objections not considered – the examination of the application to continue
Representation: Patent attorney for the applicant: Pizzeys Patent and Trade Mark Attorneys

DECISION
The specification of the application does not comply with s 40(3) because all of the claims as proposed to be amended are not clear. The uncertainties surrounding the claimed invention make the evaluation of the validity of the objections raised in Examination report No. 3 impractical. I will not decide on the grounds that were raised as objections in that report, namely: manner of manufacture, lack of unity, and inventive step.
I direct that the examination of the application continues. Under reg 13.4(3), in conjunction with reg 13.4(1)(g), I direct that the period to gain acceptance of the patent request and complete specification in relation to the application is 9 (nine) months from the date of this decision.
REASONS FOR DECISION

Throughout this decision, unless explicitly stated otherwise, any reference to the Act, or to a specific section or subsection, refers to the Patents Act 1990, and any reference to the Regulations, or to a specific regulation or subregulation, refers to the Patents Regulations 1991. In addition, any reference to the Commissioner refers to the Commissioner of Patents as per the Act.
BACKGROUND
The matter relates to patent application 2016203450 (the Application) in the name of The Regents of the University of California (the Applicant). The Application was filed on 26 May 2016 as a divisional application of 2011258875 (now a granted patent). The earliest claimed priority date is 25 May 2010.
The request for examination was filed on 15 July 2016, and the Application was subject to three examination reports as detailed below. It is perhaps worth noting that, due to the raised lack of unity objection, the Examiner limited the search and examination to the invention defined by claims 1 to 8 in all of the reports.
Examination report No. 1 was issued on 21 December 2018. The report contained objections with respect to manner of manufacture, lack of unity, and inventive step. A response to that report was filed on 04 October 2019, together with a set of proposed amendments to the specification of the Application (the Specification). In this first statement of proposed amendments, the Applicant proposed, under item 1, amended claims.
Examination report No. 2 was issued on 08 November 2019. In the report, the Examiner maintained the objections with respect to manner of manufacture, lack of unity, and inventive step. A response to that report was filed on 25 November 2019, together with a second statement of proposed amendments, in which the Applicant proposed, under item 2, a new set of amended claims.
Examination report No. 3 (the Last Report) was issued on 10 December 2019. In the Last Report, the Examiner again maintained the objections with respect to manner of manufacture, lack of unity, and inventive step. As per the Commissioner’s practice, before being issued, the Last Report was reviewed by the Examiner’s supervisor.
In response to the Last Report, on 16 December 2019, the Applicant requested to be heard. The matter was heard by way of written submissions filed on 06 May 2020. The Applicant’s responses to Examination report No. 1 and to Examination report No. 2 were also annexed to the written submissions.
APPLICABLE LAW
On 15 April 2013, the Intellectual Property Laws Amendment (Raising the Bar) Act 2012 commenced, which resulted in significant amendments to the Act and the Regulations. The Application was filed on 26 May 2016; hence the amended provisions of the Act and the Regulations apply to the examination of the Application and to the instant hearing decision.
This means that I must accept the Application if I am satisfied, on the balance of probabilities, that the Application complies with the Act. If I am not so satisfied, I may refuse the Application. However, I consider that it would only be appropriate to refuse the Application if I am satisfied that providing the Applicant with an opportunity to overcome any negative finding(s) would serve no useful purpose; in other words, if I consider that any potential negative findings are fatal to the Application.
THE SPECIFICATION
The proposed amendments

10. Given my negative findings with respect to clarity and my decision that the examination of the Application is to continue, the likelihood of further proposed amendments is high. In view of that, I do not consider it necessary to discuss in detail the allowability of the presently proposed amendments; suffice it to say that they were not objected to by the Examiner. For the benefit of the Applicant, I will consider the Specification as presently proposed to be amended by proposed amendment item 2.
The invention as described
11. The field of the invention is described as follows:
“[001] The present invention relates to a method for processing data and identifying components of biological pathways in an individual or subject and thereby determining if the individual or subject is at risk for a disorder or disease. The method may be used as a tool to perform a comparative analysis of a individual or subject’s tumor and germline sequencing data using short-read alignments stored in SAM/BAM-formatted files. The method of processing the data calculates overall and allele-specific copy number, phases germline sequence across regions of allelic-imbalance, discovers somatic and germline sequence variants, and infers regions of somatic and germline structural variation. The invention also relates to using the methods to diagnose whether a subject is susceptible to cancer, autoimmune diseases, cell cycle disorders, or other disorders.” (underlining added)

It is probably worth mentioning that “SAM” stands for “Sequence Alignment Map”, whereas “BAM” stands for “Binary Alignment Map”.
12. The section “Background” notes the importance of stratification of cancers based on genomic, transcriptional and epigenomic characteristics of the tumour (at [002]), and explains that:
“[003] With the release of multiple tumor and matched normal whole genome sequences from projects like The Cancer Genome Atlas (TCGA), there is great need for computationally efficient tools that can extract as much genomic information as possible from these enormous datasets (TCGA, 2008). Considering that a single patient’s whole genome sequence at high coverage (>30X) can be hundreds of gigabytes in compressed form, an analysis comparing a pair of these large datasets is slow and difficult to manage, but absolutely necessary in order to discover the many genomic changes that occurred in each patient’s tumor.” (underlining added)

13. The section then describes some specifics associated with breast cancer (at [004]) and concludes with the statement that “[t]here is currently a need to provide methods that can be used in characterization, diagnosis, treatment, and determining outcome of diseases and disorders” (at [005]).
14. The next section “Brief Description of the Invention” starts with the following summary:
“[006] The invention provides methods for generating databases that may be used to determine an individual’s risk, in particular, for example, but not limited to, risk of the individual’s predisposition to a disease, disorder, or condition; risk at the individual’s place of work, abode, at school, or the like; risk of an individual’s exposure to toxins, carcinogens, mutagens, and the like, and risk of an individual’s dietary habits. In addition, the invention provides methods that may be used for identifying a particular individual, animal, plant, or microorganism.” (underlining added)

15. In the subsequent paragraphs, in a manner strongly resembling consistory statements, it is stated that the invention provides: “a method of deriving a differential genetic sequence object” (at [007]); “a method of providing a health care service” (at [0017]); “a method of analyzing a population” (at [0019]); “a method of analyzing a differential genetic sequence object of a person” (at [0021]); “a method of deriving a differential genetic sequence object” (at [0025]); and “a transformation method for creating a differential genetic sequence object, the differential genetic sequence object representing a clinically-relevant difference between a first genetic sequence and a second sequence” (at [0026]). It would appear that the term “differential genetic sequence object” was devised by the present inventors and, later in the decision, I will discuss this term in some detail.
16. It is clarified in paragraph [0022] that “[w]ith respect to the various methods disclosed herein, in a preferred embodiment the patient or person is selected from the group consisting of a patient or person diagnosed with a condition, the condition selected from the group consisting of a disease and a disorder” (underlining added), wherein the condition can be one of a vast variety of conditions listed in paragraphs [0022]-[0024].
17. The section “Detailed Description of the Invention” refers to the drawings, consisting of Figures 1 to 5. In this section, it is reiterated that “[w]ith the release of multiple fully-sequenced tumor and matched normal genomes from projects like The Cancer Genome Atlas (TCGA), there is great need for tools that can efficiently analyze these enormous datasets” (at [0038]), and it is explained that:
“[0039] To this end, we developed BamBam, a tool that simultaneously analyzes each genomic position from a patient’s tumor and germline genomes using the aligned short-read data contained in SAM/BAM-formatted files (SAMtools library; Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. Epub 2009 Jun 8). BamBam interfaces with the SAMtools library to simultaneously analyze a patient’s tumor and germline genomes using short-read alignments from SAM/BAM-formatted files. In the present disclosure the BamBam tool can be a sequence analysis engine that is used to compare sequences, the sequences comprising strings of information. In one embodiment, the strings of information comprise biological information, for example, a polynucleotide sequence or a polypetide sequence. In another embodiment, the biological information can comprise expression data, for example relative concentration levels of mRNA transcripts or rRNA or tRNA or peptide or polypeptide or protein. In another embodiment, the biological information can be relative amounts of protein modification, such as for example, but not limited to, phosphorylation, sulphation, actylation, methylation, glycosilation, sialation, modification with glycosylphosphatidylinositol, or modification with proteoglycan.
[0040] This method of processing enables BamBam to efficiently calculate overall copy number and infer regions of structural variation (for example, chromosomal translocations) in both tumor and germline genomes; to efficiently calculate overall and allele-specific copy number; infer regions exhibiting loss of heterozygosity (LOH); and discover both somatic and germline sequence variants (for example, point mutations) and structural rearrangements (for example, chromosomal fusions[)]. Furthermore, by comparing the two genome sequences at the same time, BamBam can also immediately distinguish somatic from germline sequence variants, calculate allele-specific copy number alterations in the tumor genome, and phase germline haplotypes across chromosomal regions where the allelic proportion has shifted in the tumor genome. By bringing together all of these analyses into a single tool, researchers can use BamBam to discover many types of genomic alterations that occurred within a patient’s tumor genome, often to specific gene alleles, that help to identify potential drivers of tumorigenesis.
[0041] To determine if a variant discovered is somatic (that is, a variant sequence found only in the tumor) or a germline (that is, a variant sequence that is inherited or heritable) variant requires that we compare the tumor and matched normal genomes in some way. This can be done sequentially, by summarizing data at every genomic position for both tumor and germline and then combining the results for analysis. Unfortunately, because whole-genome BAM files are hundreds of gigabytes in their compressed form (1-2 terabytes uncompressed), the intermediate results that would need to be stored for later analysis will be extremely large and slow to merge and analyze.
[0042] To avoid this issue, BamBam reads from two files at the same time, constantly keeping each BAM file in synchrony with the other and piling up the genomic reads that overlap every common genomic location between the two files. For each pair of pileups, BamBam runs a series of analyses listed above before discarding the pileups and moving to the next common genomic location. By processing these massive BAM files with this method, the computer’s RAM usage is minimal and processing speed is limited primarily by the speed that the filesystem can read the two files. This enables BamBam to process massive amounts of data quickly, while being flexible enough to run on a single computer or across a computer cluster. Another important benefit to processing these files with BamBam is that its output is fairly minimal, consisting only of the important differences found in each file. This produces what is essentially a whole-genome diff between the patient’s tumor and germline genomes, requiring much less disk storage than it would take if all genome information was stored for each file separately.
[0043] BamBam is a computationally efficient method for surveying large sequencing datasets to produce a set of high-quality genomic events that occur within each tumor relative to its germline. These results provide a glimpse into the chromosomal dynamics of tumors, improving our understanding of tumors’ final states and the events that led to them. An exemplary scheme of BamBam Data Flow is shown at Figure 1.” (original italic, underlining added)
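By way of illustration only, the lockstep reading described at [0041]-[0042] resembles a merge over two position-sorted streams. The sketch below is not BamBam's code (the Specification states that the tool was written in C against the SAMtools library). The `Pileup` tuple, the function names, the in-memory reference string, and the rule that any non-reference base counts as a variant are all simplifying assumptions made for this example.

```python
from typing import Iterable, Iterator, Set, Tuple

# Hypothetical simplified pileup: (genomic position, bases observed there).
Pileup = Tuple[int, str]


def variant_alleles(bases: str, ref_base: str) -> Set[str]:
    """Assumed rule: every observed base that differs from the reference is a variant."""
    return set(bases) - {ref_base}


def synchronized_diff(tumor: Iterable[Pileup],
                      germline: Iterable[Pileup],
                      reference: str) -> Iterator[Tuple[int, str, str]]:
    """Walk two position-sorted pileup streams in lockstep.

    Only the current pair of pileups is held at any time, and a record is
    emitted only where a variant is seen, so the output is a compact "diff".
    """
    t_iter, g_iter = iter(tumor), iter(germline)
    t, g = next(t_iter, None), next(g_iter, None)
    while t is not None and g is not None:
        if t[0] < g[0]:
            t = next(t_iter, None)   # tumor stream is behind: advance it
        elif g[0] < t[0]:
            g = next(g_iter, None)   # germline stream is behind: advance it
        else:
            pos = t[0]               # common genomic location
            ref_base = reference[pos]
            t_alt = variant_alleles(t[1], ref_base)
            g_alt = variant_alleles(g[1], ref_base)
            for allele in sorted(t_alt - g_alt):
                yield pos, allele, "somatic"    # found only in the tumor
            for allele in sorted(g_alt):
                yield pos, allele, "germline"   # present in the germline
            # The pair of pileups is discarded before moving on.
            t, g = next(t_iter, None), next(g_iter, None)


if __name__ == "__main__":
    reference = "ACGTACGT"
    tumor = [(1, "CCT"), (2, "GG"), (4, "AAG"), (6, "GG")]
    germline = [(0, "AA"), (1, "CC"), (2, "GT"), (4, "AA"), (5, "CC")]
    for record in synchronized_diff(tumor, germline, reference):
        print(record)
```

Because the two inputs are consumed as iterators, memory use in this sketch depends on the size of a single pileup rather than on the size of the files, which is the property the quoted passage attributes to the method.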

18. Not much further discussion of Figure 1 is provided in the description. The figure is reproduced below, and it appears to show a largely self-explanatory, high-level schematic representation of the idea behind BamBam. It is worth noting that no differential genetic sequence object is illustrated in the figure, despite the apparent importance of such objects to the invention.

19. Regarding the differential genetic sequence objects, the Specification explains that:
“[0044] One particular exemplary embodiment of the invention is creation and use of a differential genetic sequence object. As used herein, the object represents a digital object instantiated from the BamBam techniques and reflects a difference between a reference sequence (for example, a first serquence [sic]) and an analysis sequence (for example, a second sequence). The object may be considered a choke point on many different markets. One might consider the following factors related to use and management of such objects from a market perspective:
◦ An object can be dynamic and change with respect to a vector of parameters (for example, time, geographic region, genetic tree, species, etc.)
◦ Objects can be considered to have a ‘distance’ relative to each other objects or reference sequences. The distance can be measured according to dimensions of relevance. For example, the distance can be a deviation from a hypothetical normal or a drift with respect to time.
◦ Objects can be indicative of risk: risk of developing disease, susceptibility to exposure, risk to work at a location, etc.
◦ Objects can be managed for presentation to stakeholders: health care providers, insurers, patients, etc.
    ▪ Can be presented as a graphical object
    ▪ Can be presented in a statistical format: single person, a population, a canonical human, etc.
◦ A reference sequence can be generated from the objects to form a normalized sequence. The normalized sequence can be built based on consensus derived from measured objects.
◦ Objects are representative of large sub-genomic or genomic information rather than single-gene alignments and are annotated/contain meta data readable by standard software.
◦ Objects can have internal patterns or structures which can be detected: a set of mutations in one spot might correlate to a second set of mutations in another spot which correlates to a condition; constellation of difference patterns could be a hot spot; use multi-variate analysis or other AI techniques to identify correlations; detect significance of a hot spot (for example, presence, absence, etc.)
◦ Objects related to a single person could be used as a security key” (underlining added)

20. The method of the invention can be used to: “acertain [sic] and predict responsiveness of a patient to treatment: anticipated, assumed, predicted, actual, and the like” (at [0046]); “provide patient-specific instructions: prescription, recommendation, prognosis, and the like” (at [0047]); “provide clinical information that can be used in a variety of diagnostic and therapeutic applications” (at [0048]); “provide clinical information to detect and quantify altered gene structures, gene mutations, gene biochemical modifications … for a condition associated with altered expression of a gene or protein” (at [0049]); “provide clinical information to detect and quantify altered gene structures, gene mutations, gene biochemical modifications … for a disorder associated with altered expression of a gene or protein” (at [0050] and [0052]-[0053]); and “provide clinical information for a condition associated with altered expression or activity of the mammalian protein” (at [0051]).
21. The next section is titled “Characterization and Best Mode of the Invention” and it starts with reiterating that “‘BamBam’ is a computationally efficient method for surveying large sequencing datasets to produce a set of high-quality genomic events that occur within each tumor relative to its germline” and that “[t]hese results provide a glimpse into the chromosomal dynamics of tumors, improving our understanding of tumors’ final states and the events that led to them” (at [0054]).
22. The section contains a number of sub-sections (i.e. “Diagnostics”, “Model Systems”, “Toxicology”, “Transgenic Animal Models”, “Embryonic Stem Cells”, “Knockout Analysis”, “Knockin Analysis”, “Non-Human Primate Model”, and “Exemplary Uses of the Invention”), which discuss issues reflective of their respective titles. To the extent these issues could be considered related to the instant invention, it would appear that the whole section is, essentially, devoted to discussions of what could be described as various uses or applications of the invention, despite the fact that only the last sub-section is explicitly titled accordingly.

23. The last section is titled “Examples” and contains 11 examples. While some of these appear to be describing example applications of the invention, others do not appear to have an obvious relevance to the actual invention. The latter are reproduced below in their entirety:
“Example IX: Isolation of Genomic DNA
[00103] Blood or other tissue samples (2-3 ml) are collected from patients and stored in EDTA-containing tubes at -80˚C until use. Genomic DNA is extracted from the blood samples using a DNA isolation kit according to the manufacturer’s instruction (PUREGENE, Gentra Systems, Minneapolis MN). DNA purity is measured as the ratio of the absorbance at 260 and 280 nm (1 cm lightpath; A₂₆₀/A₂₈₀) measured with a Beckman spectrophotometer.
Example X: Identification of SNPs
[00104] A region of a gene from a patient’s DNA sample is amplified by PCR using the primers specifically designed for the region. The PCR products are sequenced using methods well known to those of skill in the art, as disclosed above. SNPs identified in the sequence traces are verified using Phred/Phrap/Consed software and compared with known SNPs deposited in the NCBI SNP databank.
Example XI: Statistical Analysis
[00105] Values are expressed as mean ± SD. χ² analysis (Web Chi Square Calculator, Georgetown Linguistics, Georgetown University, Washington DC) is used to assess differences between genotype frequencies in normal subjects and patients with a disorder. One-way ANOVA with post-hoc analysis is performed as indicated to compare hemodynamics between different patient groups.” (original bold, underlining added)

24. The computer implementation is touched on in “Example VIII: Computational requirements”, again reproduced below in its entirety:
“[00101] Both BamBam and Bridget [i.e. apparently, another software tool that is of unclear relevance to the instant invention despite being mentioned in the discussions of ‘Example V’ and ‘Example VI’] were written in C, requiring only standard C libraries and the latest SAMtools source code (available from It may be run as a single process or broken up into a series of jobs across a cluster (for example, one job per chromosome). Processing a pair of 250GB BAM files, each containing billions of 100bp reads, BamBam will finish its whole-genome analysis in approximately 5 hours as a single process, or about 30 minutes on a modest cluster (24 nodes). BamBam’s computational requirements were negligible, requiring only enough RAM to store the read data overlapping a single genomic position and enough disk space to store the well-supported variants found in either tumor or germline genomes.
[00102] Bridget also had very modest computational requirements. Runtimes on a single machine were typically less than a second, which includes the time necessary to gather the reference sequence and any potential split-reads in the neighborhood of a breakpoint, build tile databases for both reference and split-reads, determine all dual spanning sets, construct potential junction sequences, re-align all split-reads to both reference and each junction sequence, and determine the best junction sequence. Regions that are highly amplified or have high numbers of unmapped reads increase the running time of Bridget, but this may be mitigated by the easy parallelizability of Bridget.” (underlining added)
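To put the figures quoted at [00101] in perspective, the short calculation below derives the implied speed-up, parallel efficiency, and sustained read rate. It is a back-of-the-envelope sketch only: the inputs are the approximate values stated in the Specification, the use of decimal units (1 GB = 1000 MB) is an assumption, and the function and variable names are introduced here for illustration.

```python
def scaling_summary(file_gb: float, n_files: int, single_hours: float,
                    cluster_minutes: float, nodes: int):
    """Return (speed-up, parallel efficiency, single-process read rate in MB/s)."""
    single_minutes = single_hours * 60
    speedup = single_minutes / cluster_minutes      # cluster vs single process
    efficiency = speedup / nodes                    # fraction of ideal linear scaling
    # Assumption: decimal units, 1 GB = 1000 MB.
    throughput_mb_s = (file_gb * n_files * 1000) / (single_hours * 3600)
    return speedup, efficiency, throughput_mb_s


if __name__ == "__main__":
    # Values quoted at [00101]: two 250GB BAM files, about 5 hours as a single
    # process, about 30 minutes on a 24-node cluster.
    speedup, efficiency, throughput = scaling_summary(250, 2, 5.0, 30.0, 24)
    print(f"speed-up ~{speedup:.0f}x, "
          f"efficiency ~{efficiency:.0%}, "
          f"read rate ~{throughput:.0f} MB/s")
```

On these approximate inputs the cluster run is about ten times faster than the single process, which corresponds to roughly 42% of ideal linear scaling across 24 nodes, and the single-process run implies a combined read rate of roughly 28 MB/s across the two files.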

The invention as described – summary
25. It would appear that the body of the Specification largely describes a software tool designed to compare two genetic sequences recorded in respective BAM files. Considerable emphasis is given to the high-level functionality of the tool and the numerous benefits that it could bring to its potential users. The operation of the tool is described at a conceptual level, which would imply that developing the necessary software is entirely routine.
The claimed invention
26. The Specification ends with 19 claims, of which claims 1, 9, and 18 are independent and are reproduced below (some formatting added to further assist readability and construction):
“1. A method of providing a health care service, comprising:
generating, by an analysis engine, a differential genetic sequence object by:
storing, in a memory of the analysis engine,
a first set of aligned sub-strings from a first genetic sequence string representing a first tissue of a patient and
a second set of aligned sub-strings from a second genetic sequence string representing a second tissue of the patient,
the first set of aligned sub-strings and the second set of aligned sub-strings overlapping a common genomic location;

producing, using the analysis engine, a local alignment
by aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location,
as part of incrementally synchronizing the first and second genetic sequence strings at respective known positions
by obtaining and storing sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location;

using, by the analysis engine, the local alignment to generate a local differential string between the first and second genetic sequence strings within the local alignment; and
using, by the analysis engine, the local differential string to update the differential genetic sequence object in a differential sequence database;

providing access to the analysis engine that is informationally coupled to a medical records storage device,
wherein the medical records storage device stores the differential genetic sequence object for the patient;

producing, by the analysis engine, a patient-specific data set
using presence of the local differential string or constellation of a plurality of local differential strings in the differential genetic sequence object for the patient; and

producing, by the analysis engine, a patient-specific instruction for the second tissue based on the patient-specific data set,
wherein the patient-specific instruction is selected from a group consisting of
a diagnosis,
a prognosis,
a prediction of treatment outcome,
a recommendation for a treatment strategy, and
a prescription for the second tissue.

…
9. A method of analyzing a population, comprising:
generating, by an analysis engine, a differential genetic sequence object by:
storing, in a memory of the analysis engine,
a first set of aligned sub-strings from a first genetic sequence string representing a first tissue of an individual patient and
a second set of aligned sub-strings from a second genetic sequence string representing a second tissue of the individual patient,
the first set of aligned sub-strings and the second set of aligned sub-strings overlapping a common genomic location;

producing, using the analysis engine, a local alignment
by aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location,
as part of incrementally synchronizing the first and second genetic sequence strings at respective known positions
by obtaining and storing sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location;

using, by the analysis engine, the local alignment to generate a local differential string between the first and second genetic sequence strings within the local alignment; and
using, by the analysis engine, the local differential string to update the differential genetic sequence object in a differential sequence database;

obtaining and storing a plurality of differential genetic sequence objects in a medical records database of a population,
wherein the medical records database is informationally coupled to the analysis engine,
the plurality of differential genetic sequence objects including the differential genetic sequence object;

identifying, by the analysis engine, a constellation of a plurality of local differential strings within the plurality of differential genetic sequence objects to produce a constellation record for the second tissue; and
using, by the analysis engine, the constellation record to generate a population analysis record.
…
18. A method of analyzing a differential genetic sequence object of a person, comprising:
generating, by an analysis engine, the differential genetic sequence object by:
storing, in a memory of the analysis engine,
a first set of aligned sub-strings from a first genetic sequence string representing a first tissue of the person and
a second set of aligned sub-strings from a second genetic sequence string representing a second tissue of the person,
the first set of aligned sub-strings and the second set of aligned sub-strings overlapping a common genomic location;

producing, using the analysis engine, a local alignment
by aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location,
as part of incrementally synchronizing the first and second genetic sequence strings at respective known positions
by obtaining and storing sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location;

using, by the analysis engine, the local alignment to generate a local differential string between the first and second genetic sequence strings within the local alignment; and
using, by the analysis engine, the local differential string to update the differential genetic sequence object in a differential sequence database;

storing a reference differential genetic sequence object in a medical records database that is informationally coupled to the analysis engine;
calculating, by the analysis engine, a deviation between
a plurality of local differential strings in the differential genetic sequence object of the person and
a plurality of local differential strings in the reference differential genetic sequence object
to produce a deviation record for the second tissue;

using, by the analysis engine, the deviation record to generate a person-specific deviation profile.”
CLAIM INTERPRETATION AND CLARITY
27. Before I can consider the objections contained in the Last Report, I need to interpret the claims to ascertain their scope. It appears that the main differences between the three independent claims are related to the different uses of the generated differential genetic sequence object. In claim 1, “a differential genetic sequence object” for a patient is used to ultimately produce “a patient-specific instruction”; in claim 9, “a plurality of differential genetic sequence objects in a medical records database of a population” is used to ultimately “generate a population analysis record”; and in claim 18, “a differential genetic sequence object of a person” and “a reference differential genetic sequence object” are used to ultimately “generate a person-specific deviation profile”. It appears that the differential genetic sequence object, as defined in each of these three claims, is generated in the same way. For convenience, I will initially focus on claim 1, and I will discuss the features of claims 9 and 18 that are different from those of claim 1 later.
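The common generation step and two of the downstream uses summarised above can be pictured with a small sketch. It is offered purely as an aid to following the claim structure and not as a construction of the claims: the claims do not define how a “differential genetic sequence object”, a “constellation”, or a “deviation” is represented, so the dictionary model, the function names, and the two comparison rules below are all assumptions introduced for this example.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical model: an object maps a genomic position to a local
# differential string such as "A>T". Nothing in the claims mandates this.
DiffObject = Dict[int, str]
DiffItem = Tuple[int, str]


def update_object(obj: DiffObject, position: int, local_diff: str) -> None:
    """Shared generation step: record a non-empty local differential string."""
    if local_diff:
        obj[position] = local_diff


def constellation(objects: List[DiffObject], min_carriers: int) -> Set[DiffItem]:
    """Claim 9 flavour (assumed rule): differences shared by at least
    `min_carriers` objects in a population."""
    counts = Counter(item for obj in objects for item in obj.items())
    return {item for item, n in counts.items() if n >= min_carriers}


def deviation(person: DiffObject, reference: DiffObject) -> Set[DiffItem]:
    """Claim 18 flavour (assumed rule): differences present in exactly one
    of the person's object and the reference object."""
    return set(person.items()) ^ set(reference.items())


if __name__ == "__main__":
    a: DiffObject = {}
    update_object(a, 10, "A>T")
    update_object(a, 20, "")          # no difference, so nothing is stored
    b: DiffObject = {10: "A>T", 30: "G>C"}
    c: DiffObject = {30: "G>C"}
    print(constellation([a, b, c], min_carriers=2))
    print(deviation(b, c))
```

Other representations and comparison rules would fit the claim wording equally well, which is why the choices above are labelled as assumptions rather than read into the claims.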
The relevant law
28. The principles of claim interpretation (sometimes referred to as “the rules of construction”) are well settled. A helpful summary is provided by the Full Court of the Federal Court in Austal Ships Sales Pty Ltd v Stena Rederi Aktiebolag [2008] FCAFC 121, citing with approval from an earlier decision:
“13 In Flexible Steel Lacing Company v Beltreco Ltd (2000) 49 IPR 331, Hely J considered at length the approach to construction of a specification and, in particular, the circumstances in which uncertainty might lead to invalidity. At [71]-[78] his Honour identified the following principles:
· The monopoly must be defined in a way that is not reasonably capable of being misunderstood.
· In determining the nature and extent of the monopoly claimed, the specification must be read as a whole, but recognizing that the parts have different functions. The claims mark out the legal limits of monopoly. What is not claimed is disclaimed. The specification describes how to carry out the process and the best method known to the patentee of doing so.
· Although the claims are construed in the context of the specification as a whole, it is not legitimate to narrow or expand the boundaries of the monopoly as fixed by a claim by adding glosses drawn from other parts of the specification. If a claim is clear, it is not to be varied, qualified or made obscure by statements found elsewhere in the document.
· It is legitimate to refer to the rest of the specification to explain the background to the claims, to ascertain the meaning of technical terms and resolve ambiguities in the construction of the claims. When the language of the claims is obscure or doubtful such doubts may be resolved by reference to the specification.
· It is not necessary that the claims be construed without reference to the body of the specification in order to see whether there is any ambiguity. The document is construed as a whole. If the specification demonstrates an intention that words used elsewhere have a particular meaning, effect should be given to such a ‘dictionary’.

14 At [79]-[81] his Honour then continued:
…
[81] Other principles of construction which may be of assistance in the resolution of the present matter include:
· A patent specification should be given a purposive construction rather than a purely literal one …
· The hypothetical addressee of the patent specification is the non-inventive person skilled in the art before the priority date. The words used in a specification are to be given the meaning which the hypothetical addressee would attach to them, both in the light of his own general knowledge and in the light of what is disclosed in the body of the specification.
· There is a fine line between, on the one hand, reading down the words of a patent claim to reflect how a person skilled in the art would understand it in a practical and commonsense way, and, on the other hand, impermissibly limiting the clear words of a claim because a reader skilled in the art would be likely to apply those wide words only in a limited range of all the situations they describe.
· It is permissible for an invention to be described in a way which involves matters of degree. Lack of precise definition in claims is not fatal to their validity, so long as they provide a workable standard suitable to the intended use. The consideration is whether, on any reasonable view, the claim has meaning. In determining this, the expressions in question must be understood in a practical, commonsense manner. Absurd constructions should be avoided and mere technicalities should not defeat the grant of protection.
· As a general rule, the terms of a specification should be accorded their ordinary English meaning.
· Evidence can be given by experts on the meaning which those skilled in the art would give to technical or scientific terms and phrases and on unusual or special meanings given by such persons to words which might otherwise bear their ordinary meaning.
· However, the construction of the specification is for the court, not for the expert witness. In so far as a view expressed by an expert depends upon a reading of the patent, it cannot carry the day unless the court reads the patent in the same way.
· Section 116 of the 1990 Act provides that the court may, in interpreting a complete specification, refer to the specification without amendment. However, it is neither useful nor legitimate to do so where the amended specification is clear.” (underlining added)

29. The High Court in Interlego AG v Toltoys Pty Ltd (“Lego case”) [1973] HCA 1; (1974) 130 CLR 461 also stated (at p. 479) that:
“14 … If the expression is not clear it is then permissible to resort to the body of the specification to define or clarify the meaning of words used in the claim without infringing the rule that clear and unambiguous words in the claim cannot be varied or qualified by reference to the body of the specification …” (underlining added)

30. Relevantly to my decision, it is important to note that the claims are to be interpreted purposively on the basis of the actual wording chosen by the Applicant as clarified, when necessary, by the body of the Specification.
Interpretation of claim 1
31. In the following analysis, I will discuss the construction of claim 1 with a particular emphasis on some words, phrases, and expressions used in this claim. I will also consider the implications of my construction for the clarity of the claim.
“analysis engine”
32. The analysis engine has a memory; hence it is a physical entity. Based on the functions it performs, I would characterise it as a physical computing tool (in the broadest meaning of the term, e.g. not necessarily a single computing device) configured to perform these functions.
“string” and “sub-strings”
33. In the claim, the words “string” and “sub-strings” are used in the phrases “genetic sequence string”, “set of aligned sub-strings”, and “local differential string”. As a preliminary observation, I note that there is no evidence, and nothing in the body of the Specification suggests, that the word “string” should be interpreted differently from its ordinary English meaning. The same applies to the word “sub-string”. I consider this to be of particular importance to the interpretation of the phrases “genetic sequence string” and “set of aligned sub-strings” given later in the decision.
34. In my opinion, in its plain meaning, “sub-string” simply denotes a string that is a part of another (e.g. larger) string. Macquarie Dictionary (online version reviewed on 06 May 2021) provides several definitions for the word “string”, of which the following appear most relevant in the context of the claim (original emphasis):
“3. something resembling a string or thread.”
“5. any series of things arranged or connected in a line or following closely one after another: a string of islands; a string of vehicles; to ask a string of questions.”
“6. a set or number, as of animals: a string of racehorses.”

35. It is perhaps worth noting that, in the art of computing, “string” is a type of data or, in other words, a digital format in which information (usually textual) is handled. The values of this type of data (i.e. the string-type values) are ordered sequences of characters (usually up to a certain length), e.g. “Good morning”, “k+ldfi8 8$55-7ksk#”, etc. Each character in the ordered sequence is encoded according to a standard correspondence table or chart (e.g. ASCII). The empty space, as e.g. in “Good morning”, is also considered a character.
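By way of illustration only, the computing sense of “string” described in paragraph 35 can be sketched in Python as follows. The variable names are assumptions made for this sketch and do not come from the claim or the Specification; ASCII is used simply because it is the example of an encoding table mentioned above.

```python
# Illustrative only: "text" and the derived names are assumptions for this sketch.
text = "Good morning"

# A string-type value is an ordered sequence of characters.
characters = list(text)

# Each character corresponds to a code in a standard table; ASCII is used here.
codes = list(text.encode("ascii"))

# The empty space is itself a character and counts towards the length.
space_position = text.index(" ")

# A sub-string is a string that is a part of another string.
sub_string = text[5:12]

print(len(text), space_position, codes[space_position], sub_string)
```

The sketch shows only the ordinary computing notion of a string and a sub-string; it says nothing about how the analysis engine of claim 1 stores or handles its data.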
36. The local differential string is generated by the analysis engine which, as I have concluded, is a physical computing tool. In addition, the sub-strings are stored in the memory of the analysis engine. It is therefore clear that the local differential string and the sub-strings are digital entities, and this somewhat weighs in favour of the word “string” being interpreted broadly in line with its meaning in the art of computing. On the other hand, the claim does not define that the local differential string, when generated, is necessarily a string-type value, or that the sub-strings are necessarily stored in the memory as string-type values. Hence, I conclude that the local differential string and the sub-strings are digital entities, each representing an ordered sequence of characters, regardless of the format in which they are stored in the memory or otherwise handled by the analysis engine. For brevity, I will also use the phrases “local differential string” and “sub-string” to refer to the actual ordered sequences of characters represented respectively by these two digital entities.

“sub-strings from a … genetic sequence string”
37. Relevantly, what is stored in the memory of the analysis engine is “a first set of aligned sub-strings from a first genetic sequence string representing a first tissue of a patient and a second set of aligned sub-strings from a second genetic sequence string representing a second tissue of the patient” (underlining added). I consider that the word “from” clearly indicates that each sub-string somehow originates from its respective genetic sequence string. This, in combination with the meaning of the word “sub-string”, which I have already discussed, strongly suggests that each sub-string is a part of the respective genetic sequence string, and I have no reason to adopt a different construction.
38. An important consequence of the above is that the method of claim 1 requires, for “generating, by an analysis engine, a differential genetic sequence object”, the pre-existence of the first and the second genetic sequence strings representing, respectively, the first and second tissues of the patient. This appears to be further reinforced by the expression “incrementally synchronizing the first and second genetic sequence strings at respective known positions by obtaining and storing sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location” (underlining added).
“genetic sequence string”
39. Since each sub-string originates from its respective genetic sequence string and is a part of that string, it is reasonable to infer that the genetic sequence strings are also ordered sequences of characters, although they might not necessarily be digital entities because they do not appear to be directly handled by the analysis engine. In its ordinary English meaning, the phrase “genetic sequence string” is simply a string that is a genetic sequence, hence I will interpret it as an ordered sequence of characters which form a genetic sequence. While the genetic sequence strings represent tissues of a patient, they may or may not represent the complete genome of these tissues. It is possible, for example, that only a few specific genes are included in one or both of the genetic sequence strings, with the gaps between the neighbouring genes filled in with empty spaces or other characters representing or somehow encoding the fact that genetic information is missing.
40. The above interpretation of the phrase “genetic sequence string” is based on the ordinary English meaning of the word “string”. Importantly, there is no evidence before me that the phrase in question has a special meaning in the art such that the phrase could denote something which, in fact, is not a “string” within the ordinary meaning of this word. Therefore, I must conclude that each genetic sequence string is indeed a single string, hence it is a single continuous ordered sequence of characters. To further clarify my interpretation, in the absence of any evidence to the contrary, I do not consider that the genetic sequence information recorded in a SAM/BAM file is a genetic sequence string. The structure of such a file is more complex and the genetic information is not represented as a single continuous ordered sequence of characters. Instead, such information could possibly be broadly described as being represented as a collection of genetic sequence strings. I must emphasise that this distinction is not minor or academic, and it deeply affects the claim interpretation and clarity, as will be seen later in this decision.
“set of aligned sub-strings”
41. I have already noted that I do not consider that the genetic sequence information recorded in a SAM/BAM file is a genetic sequence string. Similarly, despite the Applicant’s apparent suggestion to the contrary (see later in this decision), I do not consider that the sequence reads in a SAM/BAM file can be referred to as aligned sub-strings from a genetic sequence string or, in other words, aligned sub-strings from a pre-existing single continuous ordered sequence of characters which form a genetic sequence.
42. The interpretation of the phrase “set of aligned sub-strings” is made difficult mainly by the fact that each one of the sub-strings in the set originates from the same genetic sequence string, i.e., from the same single continuous ordered sequence of characters, as opposed to originating from a BAM file or a similar record of genetic information representing sequence reads. For example, the question arises of how a set of sub-strings can be generated from a single string such that the sub-strings are somehow aligned.
43. While I accept that phrases such as “sequence alignment”, “global alignment”, “local alignment”, and the like may possibly be considered terms used in the art, I am unsure of their direct applicability to the present case where, as mentioned above, what must be aligned is sub-strings each originating from the same genetic sequence string. Therefore, in interpreting the phrase in question, I need to use the plain English meaning of the words in it.
44. Macquarie Dictionary (online version reviewed on 12 May 2021) provides a large number of definitions for the word “set”. I consider that the most appropriate definitions in the context of the claim are (original emphasis):
“57. a number of things customarily used together or forming a complete assortment, outfit, or collection: a set of dishes.”
“78. Mathematics any collection of numbers or objects which have some common property.”

45. The phrase “set of aligned sub-strings” means plainly that the sub-strings in the set are aligned. Macquarie Dictionary (online version reviewed on 05 October 2021) provides definitions for the word “align” as follows (original emphasis):
“1. to adjust to a line; lay out or regulate by line; form in line.”
“2. to bring into line.”
“3. Politics to bring into line with a particular tradition, policy, group or power.”
“4. to adjust (mechanical items such as motor vehicle wheels) so that as a group they are in positions favouring optimum performance.”
“5. to fall or come into line; be in line.”
“6. to join with others in a cause.”

“set of aligned sub-strings” – what is defined in the claim?
46. The claim defines that “the first set of aligned sub-strings and the second set of aligned sub-strings [are] overlapping a common genomic location” (underlining added) and also defines “aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information …” (underlining added). This appears to suggest that the way in which the sub-strings are aligned must be related to the genomic location and the information about it. Hence, it is the genomic locations in different sub-strings that must be brought into line (see e.g. the second definition above) in order to align the sub-strings. I also note that such an interpretation does not appear clearly inconsistent with the meaning of the above-mentioned terms in the art that use the word “alignment”.
47. Generally, the sub-strings in the set could be aligned either with respect to something else (e.g. some common reference) or with respect to each other. The fact that the claim does not explicitly define an entity with respect to which the sub-strings are aligned, prima facie, suggests that they are aligned with respect to each other. Nonetheless, I will also consider all other possibilities which could be characterised as somewhat reasonable.
48. As one such possibility, since the claim defines that “the first set of aligned sub-strings and the second set of aligned sub-strings [are] overlapping a common genomic location” (underlining added), it could possibly be argued that, because of that, the sub-strings in the set are each aligned with respect to that location.
49. Although a term used in the art, “genomic location” is still open to interpretation to a certain degree. In the same way as a location on a geographical map is not necessarily limited to the location of a single geometrical point on the map (e.g. it could be a location of a city, a country, etc.), the term “genomic location” is not limited to the location of a single base in the genome. It could also refer to a location of a gene, a region of several consecutive bases, etc. in the genome.
50. The word “overlap” is given the following definitions in Macquarie Dictionary (online version reviewed on 12 May 2021, original emphasis):
“1. to lap over (something else or each other); extend over and cover a part of.”
“2. to cover and extend beyond (something else).”
“3. to coincide in part with; correspond partly with.”
“4. to lap over.”
“5. an overlapping.”
“6. the extent or amount of overlapping.”
“7. an overlapping part.”
“8. the place of overlapping.”

51. Following the above definitions, I consider that “to overlap” does not necessarily mean to cover completely, hence the overlap may be a partial overlap when the genomic location is larger than the location of a single base (i.e. a single character in the genetic sequence string).
52. Importantly, the ordinary English meaning of the above expression “the first set of aligned sub-strings and the second set of aligned sub-strings [are] overlapping a common genomic location” suggests that it is the two sets that are overlapping the common genomic location. I consider that a set of sub-strings is overlapping a genomic location if at least one of the sub-strings in the set is overlapping that location, which means that not all sub-strings in the set must necessarily overlap that location. Hence, as not all sub-strings in the set need to overlap the common genomic location, it would appear that the above expression (i.e. “the first set of aligned sub-strings and the second set of aligned sub-strings [are] overlapping a common genomic location”) does not impose the limitation that the sub-strings in each set are aligned with respect to that location.
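The construction in paragraphs 51 and 52 can be illustrated by a minimal Python sketch. It assumes, purely for illustration, that each sub-string and the common genomic location are represented by a hypothetical (start, end) pair of positions in the genetic sequence string, with the end position excluded; neither this representation nor the function names appear in the claim or the Specification.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # hypothetical (start, end) positions, end excluded


def overlaps(sub_string: Interval, location: Interval) -> bool:
    """True if the two ranges share at least one position (a partial overlap suffices)."""
    return sub_string[0] < location[1] and location[0] < sub_string[1]


def set_overlaps(sub_strings: List[Interval], location: Interval) -> bool:
    """True if at least one sub-string in the set overlaps the location."""
    return any(overlaps(s, location) for s in sub_strings)


# Assumed example data: a three-position location and a set of three sub-strings.
common_location = (10, 13)
first_set = [(0, 5), (8, 11), (20, 30)]

individual = [overlaps(s, common_location) for s in first_set]
print(individual, set_overlaps(first_set, common_location))
```

In this assumed example only the second sub-string overlaps the location, and only partially, yet the set as a whole satisfies the test. This is the sense in which the expression does not require every sub-string in the set to be aligned with respect to the common genomic location.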
53. Furthermore, the claim later defines “producing … a local alignment by aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location” (underlining added). However, if the sub-strings in the sets were each aligned with respect to the common genomic location, then the two sets of aligned sub-strings would be already aligned using the common genomic location. Hence, the defined further aligning of the first and the second sets of aligned sub-strings would appear somewhat unnecessary.
54. As another possibility, it might be argued that the sub-strings in the set are each aligned with respect to the genetic sequence string they come from. However, since these sub-strings are from the same genetic sequence string, each sub-string would always be somehow aligned to the string, and the word “aligned” would be redundant. Furthermore, such an interpretation would appear to be against the principle of purposive construction, as it would allow for a random selection of sub-strings from the genetic sequence string to be present in the set, as long as one of the sub-strings in the set overlaps the common genomic location. As explained above, this will be sufficient to satisfy the requirement, defined in the claim, that the set of aligned sub-strings must be overlapping the common genomic location.
55. All this strongly suggests that the sub-strings in each set are indeed aligned with respect to each other. Importantly, this will only be possible if each set contains more than one sub-string. Hence, on the basis of the wording of the claim, I conclude that each set of aligned sub-strings contains at least two sub-strings aligned with respect to each other.
56. It would appear reasonable to consider that the sub-strings in a set will be aligned with respect to each other if these sub-strings are at least partially overlapping such that they can be superimposed or overlaid to produce a continuous sequence of characters with no gaps. If that is not the case, then there would exist sub-sets of sub-strings within the set, such that no sub-string from one sub-set overlaps with any sub-string from another sub-set. I consider that a set of sub-strings having such sub-sets would be unreasonable to characterise as a “set of aligned sub-strings”. In that situation, it would appear that the sub-strings within each sub-set would be aligned with respect to each other, but this would not apply to the sub-strings of the whole set. Furthermore, the lack of overlap between sub-strings from different sub-sets within such a set would result in the possibility of the sub-strings in all but one of the sub-sets being selected completely at random, which again goes against the principles of purposive construction.
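The test described in paragraph 56, read together with the conclusion in paragraph 55, can be sketched in Python as follows. The sketch assumes that each sub-string is represented by a hypothetical (start, end) pair of positions in its genetic sequence string, with the end position excluded, and it treats sub-strings that merely abut without sharing a position as not overlapping. Both are assumptions of this sketch and are not taken from the claim.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # hypothetical (start, end) positions, end excluded


def mutually_aligned(sub_strings: List[Interval]) -> bool:
    """True if the set has at least two sub-strings that overlay into one gap-free run."""
    if len(sub_strings) < 2:
        return False  # a single sub-string cannot be aligned with respect to another
    ordered = sorted(sub_strings)
    reach = ordered[0][1]  # furthest position covered so far
    for start, end in ordered[1:]:
        if start >= reach:
            return False  # no shared position: the set splits into separate sub-sets
        reach = max(reach, end)
    return True


print(mutually_aligned([(0, 6), (4, 10), (8, 14)]))   # chained overlaps
print(mutually_aligned([(0, 6), (4, 10), (20, 26)]))  # third sub-string is isolated
```

The second call corresponds to the situation described above in which a set contains sub-sets with no overlap between them, so that the set as a whole would not reasonably be characterised as a “set of aligned sub-strings”.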
57. An example drawing of what appears to be defined in the claim is presented below. Despite the requirements for: (i) mutual alignment of the sub-strings within each set; and (ii) the two sets overlapping a common genomic location, a certain residual degree of randomness appears unavoidable. This is because the claim does not define a specific way in which the sub-strings in a particular set are to be selected from the respective genetic sequence string for inclusion in the set.
58. The claim further defines:
“producing, using the analysis engine, a local alignment by aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location,
…
using, by the analysis engine, the local alignment to generate a local differential string between the first and second genetic sequence strings within the local alignment” (underlining added)

59. I consider that the only reasonable interpretation of the above is that the analysis engine compares the two sets of aligned sub-strings, which are further aligned using the common genomic location (thus producing a local alignment), to generate the local differential string. It is important to note, however, that since all sub-strings in a set are parts of the same genetic sequence string, all sub-strings in the set will always have the same character at the same single base genomic location, as shown in the example drawing above. Therefore, on purposive construction, it is unclear why a set of aligned sub-strings from each of the first and second genetic sequence strings is needed for this comparison, instead of a single sub-string from each of the first and second genetic sequence strings, the single sub-string overlapping as much as possible the common genomic location. The claimed requirement for a set of sub-strings only adds to the degree of randomness in the selection of sub-strings from each string, and I would even go as far as saying that, on purposive construction, using sets of aligned sub-strings as defined in the claim makes no sense, unless each set always contains only one sub-string. However, as I have already discussed, the wording of the claim weighs strongly against such an interpretation of the phrase “set of aligned sub-strings”. Indeed, there is no reason to use this phrase in the claim if any set can only ever have one sub-string.
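The observation in paragraph 59 can be illustrated by a short Python sketch. The two ten-character strings, the chosen ranges, and the function names are all assumptions made for this illustration; they are not taken from the claim or the Specification.

```python
first_string = "ACGTACGTAC"   # hypothetical first genetic sequence string
second_string = "ACGTTCGTAC"  # hypothetical second string, differing at position 4


def take(string, ranges):
    """Cut sub-strings out of a single string, keeping each one's start position."""
    return [(start, string[start:end]) for start, end in ranges]


def characters_at(sub_strings, position):
    """Collect the distinct characters that the sub-strings show at one position."""
    return {
        text[position - start]
        for start, text in sub_strings
        if start <= position < start + len(text)
    }


ranges = [(0, 6), (2, 8), (4, 10)]  # three mutually overlapping ranges
first_set = take(first_string, ranges)
second_set = take(second_string, ranges)

# Every sub-string covering position 4 reports the same character within each set.
print(characters_at(first_set, 4), characters_at(second_set, 4))

# A single sub-string from each string yields the same comparison at that position.
print(characters_at(take(first_string, [(2, 8)]), 4),
      characters_at(take(second_string, [(2, 8)]), 4))
```

Because every sub-string is cut from one and the same string, no position can ever show more than one character within a set. On these assumptions, the additional sub-strings in each set contribute nothing to the comparison that a single sub-string does not already provide.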
60. In summary, the wording of the claim insists on an interpretation that makes no sense on purposive construction; hence I need to look for clarification in the body of the Specification.
“set of aligned sub-strings” – what is described in the body of the Specification?
61. It appears that the phrase “set of aligned sub-strings” is not used in the body of the Specification at all. Instead, the description mentions that “the first and second sequence strings have a plurality of corresponding sub-strings”, and also “producing, using the sequence analysis engine, a local alignment by incrementally synchronizing the first and second sequence strings using a known position of at least one of plurality of corresponding sub-strings”, wherein “the corresponding sub-strings comprise homozygous alleles” or “comprise heterozygous alleles” (at [007], underlining added). It is also stated that:
“[008] In a preferred embodiment, the step of synchronizing comprises aligning at least one of the plurality of sub-strings is based on an a priori known location within the first string. In an alternative preferred embodiment the step of synchronizing comprises aligning at least one of the plurality of sub-strings based on a known reference string comprising known locations for the at least one of the plurality of sub-strings. In a more preferred embodiment, the known reference string is a consensus sequence.
[009] In another preferred embodiment, the step of synchronizing comprises aligning the at least one of the plurality of sub-strings within a window having a length of less than a length of the at least one of the plurality of sub-strings.” (original italic, underlining added)

Similar statements can also be found in paragraphs [0025]-[0026].
62. Firstly, it is worth noting that all of the above paragraphs come from the section “Brief Description of the Invention” and are written in a format strongly suggestive of consistory statements. Secondly, the word “aligned” (as in the context of a “set of aligned sub-strings”) is missing. Thirdly, the word “plurality” is different from the word “set” that is used in the claim.
63. Macquarie Dictionary (online version reviewed on 03 June 2021) provides the following definitions for “plurality” (original emphasis):
“1. more than half of the whole; the majority.”
“2. → majority (def. 3).”
“3. a number greater than unity.”
“4. the fact of being numerous.”
“5. a large number, or a multitude.”
“6. the state or fact of being plural.”
“7. → pluralism (def. 3).”
“8. any of the offices or benefices held under a system of pluralism (def. 3).”

64. All of the above definitions (and especially the third definition) clearly suggest that a plurality of sub-strings must contain more than one sub-string.
“set of aligned sub-strings” – conclusion
65. What is described in the body of the Specification appears to be somewhat different from what is claimed. At best, the word “plurality” could be considered supportive of my earlier finding that the set of aligned sub-strings, as claimed, must contain more than one sub-string, thus reinforcing my view that the claim makes no sense on purposive construction.
66. On balance, I conclude that the wording of claim 1, and in particular the phrase “set of aligned sub-strings” as used in the claim, leads to clarity issues that cannot be properly resolved.
“local differential string”
67. I have already found that the local differential string is a digital entity. I consider that it is a digital entity representing an ordered sequence of characters, which is indicative of the differences between the first and the second genetic sequence strings within the local alignment, as determined by a comparison of the first and the second sets of aligned sub-strings. No specific way of indicating or encoding these differences in the local differential string is defined.
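Since the claim defines no particular way of encoding the differences, any concrete example is necessarily hypothetical. The following Python sketch shows one possible encoding, chosen solely for illustration: a placeholder character where the two strings agree within the compared range, and the character of the second string where they differ. The function name, the placeholder, and the example strings are assumptions of this sketch and should not be read as the encoding used by the Applicant.

```python
def local_differential_string(first, second, start, end, same="."):
    """Hypothetical encoding: mark agreement with a placeholder, keep differing characters."""
    window_first = first[start:end]
    window_second = second[start:end]
    return "".join(
        same if a == b else b
        for a, b in zip(window_first, window_second)
    )


first_string = "ACGTACGTAC"   # hypothetical first genetic sequence string
second_string = "ACGTTCGTAC"  # hypothetical second string, differing at position 4

result = local_differential_string(first_string, second_string, 2, 8)
print(result)
```

Many other encodings would equally be “indicative of the differences” between the two strings, for example a list of positions with the differing characters. The claim language does not choose between them.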
“differential genetic sequence object” and the related processes
68. The entity referred to as a “differential genetic sequence object” and its use appear to be pivotal to the instant invention. In what follows, I will consider the nature, content, and the creation or generation of this entity.
Does the body of the Specification provide a “dictionary” for the meaning of the term “differential genetic sequence object”?
69. As I have already mentioned, with respect to this term the description states:
“[0044] One particular exemplary embodiment of the invention is creation and use of a differential genetic sequence object. As used herein, the object represents a digital object instantiated from the BamBam techniques and reflects a difference between a reference sequence (for example, a first serquence [sic]) and an analysis sequence (for example, a second sequence).” (underlining added)

70. It is important for me to decide whether the above is to be regarded as a “dictionary” defining the meaning of a term or as a mere explanation given as part of describing the invention. From the outset, I must note that the somewhat broad and vague language of paragraph [0044] (quoted in full earlier) is lacking the clarity and precision required for a proper “dictionary” and is much more suggestive of a mere explanation. It might be worth mentioning that this paragraph even includes considerations about the “factors related to use and management of such [differential genetic sequence] objects from a market perspective”.
71. In addition, the text in paragraph [0044] describes the differential genetic sequence object as “a digital object instantiated from the BamBam techniques”, i.e., by referring to other parts of the description and possibly the drawings. Hence, it could even be argued that if this text is to be considered a “dictionary” definition, then the claim could be deficient with respect to s 40(3A).
72. Furthermore, “the BamBam techniques”, from which the differential genetic sequence object is “instantiated”, involve the comparison of two BAM files (see Figure 1 above as well as the description in paragraphs [0039]-[0043] quoted in full earlier). However, BAM files are not defined in the claim at all, hence it would appear that the term “differential genetic sequence object” as used in the claim may not necessarily refer to the same entity as the differential genetic sequence object described in the body of the Specification. This could potentially be confusing.
73. I conclude that the description of the differential genetic sequence object provided in the body of the Specification is indeed just an illustrative explanation that is non-exhaustive and only intended to facilitate the understanding of the described invention. I do not consider that it is a “dictionary” defining the term in question.
What is the nature of the “differential genetic sequence object” and what does it contain?
74. As defined in the claim, the differential genetic sequence object is generated/updated by the analysis engine, hence it is clearly a digital entity. The claim further defines “producing, by the analysis engine, a patient-specific data set using presence of the local differential string or constellation of a plurality of local differential strings in the differential genetic sequence object” (underlining added). This suggests that the differential genetic sequence object contains at least one local differential string. However, it may also contain a constellation of a plurality of local differential strings, and this leads to the next consideration.
“constellation of a plurality of local differential strings”
75. There is no evidence that the word “constellation” (especially when used in the context of a plurality of local differential strings) has a specific well-defined meaning in the art, and the Specification does not appear to suggest that. Macquarie Dictionary (online version reviewed on 10 May 2021) provides the following definitions for this word (original emphasis):
“1. Astronomy
a. any of various groups of stars to which definite names have been given, as the Southern Cross.
b. a division of the heavens occupied by such a group.”
“2. Astrology
a. the grouping or relative position of the stars as supposed to influence events, especially at a person’s birth.
b. Obsolete character as supposed to be determined by the stars.”
“3. any brilliant assemblage.”
“4. Psychology a group of emotionally coloured ideas, mostly repressed.”

76. It appears to me that none of the above definitions can directly assist in providing a meaningful interpretation of the expression in question. In addition, since the word “constellation” itself is somewhat suggestive of a group (i.e. a plurality) of stars or other objects, it is unclear what could be a “constellation of a plurality of local differential strings” (underlining added), and how this differs from simply “a plurality of local differential strings”. It would appear that the constellation “in the differential genetic sequence object for the patient” must be an entity somehow derived from the plurality of local differential strings. For example, it may well be possible that the Applicant intended that the constellation should refer to a group of somehow related local differential strings from the plurality of local differential strings, or even a set of such groups; however, none of these possibilities can be unambiguously ascertained from the claim language. I consider that the expression “constellation of a plurality of local differential strings”, considered in the context of the claim, is unclear. Depending on how this unclear expression is used in the claim, the result may or may not be that the scope of the claim is uncertain, and I will consider this matter below.
77. The expression “producing … using presence of the local differential string or constellation of a plurality of local differential strings” (underlining and bold added) appears, prima facie, to define two options, the second being defined by using the unclear expression. However, if on proper interpretation, the claim language would imply that “the local differential string” must be one of the “plurality of local differential strings”, then it could potentially be argued that “using presence of … constellation of a plurality of local differential strings” would inevitably involve “using presence of the local differential string”, and the second option would only be narrowing the scope of the claim in comparison to the first option. In this case, it could be argued that the scope of the claim would be determined by the broader first option, and the unclear second option would not lead to uncertainties in the scope of the claim.
78. Importantly though, I am unable to find anything in the wording of the claim that would suggest that the entity referred to as “the local differential string” must necessarily be part of the defined “plurality of local differential strings”, i.e., this plurality may include only local differential strings that are different from “the local differential string”. In addition, it could also be argued that even if “the local differential string” is one of the “plurality of local differential strings”, it does not necessarily follow that the “presence of the local differential string” will be used within the second option (i.e. the defined constellation might only use other local differential strings from the plurality).
79. As the claim language allows an interpretation according to which the two options are indeed alternatives, both affecting the scope of the claim, I am of the view that this interpretation should be preferred in the circumstances of the case. The alternative interpretation, in which the scope of the claim is determined by the first option, requires in essence that the whole expression “… or constellation of a plurality of local differential strings” is disregarded for the purpose of determining the scope of the claim. In the absence of any clear justification for this in the context of the claim, I do not find such an interpretation satisfactory.
80. I conclude that the claim defines two options, both affecting the scope of the claim. Since the second option is defined using the unclear expression, it follows that the claim, based on its wording, is not clear.
81. I will now attempt to clarify the expression that results in the lack of clarity by referring to the body of the Specification. The description mentions the word “constellation” on seven occasions as reproduced below (underlining of the word added):
“[0017] … producing, by the analysis engine, a patient-specific data set using presence of a local differential string or constellation of a plurality of local differential strings in the differential genetic sequence object for the patient; …
…
[0019] … identifying, by the analysis engine, a constellation of a plurality of local differential strings within the plurality of differential genetic sequence objects to produce a constellation record; and using, by the analysis engine, the constellation record to generate a population analysis record. …
[0020] In an alternative embodiment the method disclosed herein further comprises a step of comparing a constellation record of an individual patient with the population analysis record. In a preferred embodiment, the step of comparing of the constellation record of the individual patient with the population analysis record creates a patient-specific record. …
…
[0044] …
…
o Objects can have internal patterns or structures which can be detected: a set of mutations in one spot might correlate to a second set of mutations in another spot which correlates to a condition; constellation of difference patterns could be a hot spot; use multi-variate analysis or other AI techniques to identify correlations; detect significance of a hot spot (for example, presence, absence, etc.)
…”

82. The only mention of the word “constellation” that is not part of a consistory statement appears to be in paragraph [0044], which I have reproduced earlier in its entirety. As I have already noted, the language in this paragraph is rather broad and vague, including expressions like “[differential genetic sequence] [o]bjects can have internal patterns or structures which can be detected”, “constellation of difference patterns could be a hot spot”, “use multi-variate analysis or other AI techniques to identify correlations”, etc. These are merely suggestive of the data analysis that could potentially be performed on the differential genetic sequence object. Any discussions appear to be at a conceptual level, with considerable emphasis on the potential use of the differential genetic sequence objects and the benefits of that use. The actual specific implementation of the concepts, including the precise content and structure of the differential genetic sequence object, appears to be left to the skilled addressee. It is also worth mentioning that the phrase used in this paragraph, i.e., “constellation of difference patterns”, appears somewhat different to the phrase used in the claim, i.e., “constellation of a plurality of local differential strings”.
83. In summary, the above quoted parts and the rest of the body of the Specification do not appear to provide any additional useful information about the nature of the entity referred to by the expression “constellation of a plurality of local differential strings”. I conclude that the Specification does not provide sufficient information to allow me to give an unambiguous interpretation of the expression in question as used in the claim; hence I find that the use of this unclear expression results in claim 1 being not clear.
“generating … a differential genetic sequence object”
84. Importantly, the process defined as “generating … a differential genetic sequence object” appears to be accomplished exclusively by “updat[ing] the differential genetic sequence object in a differential sequence database”, which implies the pre-existence of that object. Such wording creates reasonable doubt as to whether or not the step that generates the initial differential genetic sequence object (before any updating) may be missing from the claim. Attempts to clarify this by reference to the body of the Specification prove unsuccessful as the generation of the differential genetic sequence object does not appear to be mentioned outside of the consistory statements. As I have already noted, the body of the Specification puts more emphasis on the potential use of the differential genetic sequence objects and the benefits of such use, the actual specific implementation being left to the skilled addressee.
85. It could possibly be argued that, on purposive construction, what is claimed is the generation of the differential genetic sequence object by starting with using a single local differential string, which must be “the local differential string” since it is the first to be generated in the claim, to create or generate the initial differential genetic sequence object, and then incrementally updating the (initial) differential genetic sequence object by using other, subsequently generated, local differential strings within local alignments corresponding to other common genomic locations. However, the claim clearly defines that “the local differential string” is used “to update the differential genetic sequence object in a differential sequence database” (underlining added), which means that the differential genetic sequence object already existed, e.g., possibly even before the generation of “the local differential string”. In my view, this clear and unambiguous wording is not something that can be ignored; hence it appears that the claimed process of “generating … a differential genetic sequence object” does not actually result in the generating of such an object. On balance, I consider that the above issues render the claim not clear.
“incrementally synchronizing the first and second genetic sequence strings at respective known positions”
86. This is another process defined in the claim, which process appears to be somewhat intertwined with the process of “generating … a differential genetic sequence object”. The word “synchronising”, as used with respect to two genetic sequence strings, does not appear to be a term with a well-defined meaning in the art. Macquarie Dictionary (online version reviewed on 28 May 2021) defines the word “synchronise” as follows (original emphasis):
“1. to occur at the same time, or coincide or agree in time.”
“2. to go on at the same rate and exactly together; recur together.”
“3. to cause to indicate the same time, as one clock with another.”
“4. to cause to go on at the same rate and exactly together.”
“5. to cause to agree in time of occurrence; assign to the same time or period, as in a history.”

87. The above definitions appear to insist on a time-related context and could be applicable to situations of synchronising events or processes, but are difficult to apply with respect to “synchronising” two genetic sequence strings, which do not occur or extend in time. Hence these definitions are of little assistance in the present case. From the claim language, the use of the word “synchronise” appears to be somehow related to the alignment, because “aligning the first set of aligned sub-strings and the second set of aligned sub-strings …” is done “as part of incrementally synchronizing the first and second genetic sequence strings at respective known positions” (underlining added).
88. All this suggests that the word “synchronising” may not be the most suitable word to denote the process defined in the claim. In addition, the definition of the process itself creates some interpretation issues as discussed below.
The relationship between the processes of “generating … a differential genetic sequence object” and “incrementally synchronizing the first and second genetic sequence strings”
89. I note that “aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location” is clearly defined as a step of the process of “generating … a differential genetic sequence object”. However, the same step is also performed “as part of incrementally synchronizing the first and second genetic sequence strings at respective known positions”. In other words, the same step appears to be defined as part of the two processes under consideration and this, in my view, creates a number of issues.
90. Firstly, while it is clear that these two processes are somewhat intertwined or overlapping (i.e. they have at least one common step), the degree of overlap is uncertain – for example, in the extreme case, it may be possible that “generating … a differential genetic sequence object” and “incrementally synchronizing the first and second genetic sequence strings” are, in fact, two names given in the claim to the same process. Secondly, it is unclear whether the process of “incrementally synchronizing …” as defined in the claim actually limits the scope of the claim in any way and, if so, what the relationship is between the defined “common genomic location” and “next common genomic location”, i.e., “next” with respect to what. It is also unclear how many common genomic locations the two genetic sequence strings have: only the two defined (the “common genomic location” and the “next common genomic location”), or potentially more than two.
91. In addition, producing a local alignment is done “by aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location” (underlining added) and, as I already noted, this is also part of the process of “incrementally synchronizing …”. However, when the claim defines how the process of “incrementally synchronizing …” is performed, it is said that this happens “by obtaining and storing sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location” (underlining added) and no other steps of the process are mentioned. This means, for example, that whether or not the “sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location” are actually also aligned using, by implication, the next common genomic location remains uncertain.
92. In summary, the way in which the processes of “generating … a differential genetic sequence object” and “incrementally synchronizing the first and second genetic sequence strings” are defined creates ambiguities resulting in claim 1 being not clear.
Interpretation of the rest of the claims
93. As I have already mentioned, it appears that the process of generating the differential genetic sequence object is defined in the same way in all three independent claims. The same could be said for the process of incrementally synchronising the two genetic sequence strings. Hence, all clarity issues related to these processes (including in the phrases and expressions used to define them) that I have identified for claim 1 are also applicable to claims 9 and 18.
94. In addition, claim 9 defines “identifying, by the analysis engine, a constellation of a plurality of local differential strings within the plurality of differential genetic sequence objects to produce a constellation record for the second tissue” (underlining added). This creates similar clarity problems, related to the use of the word “constellation”, as those discussed in reference to claim 1.
95. Further, claim 18 defines “storing a reference differential genetic sequence object in a medical records database … ; calculating … a deviation between a plurality of local differential strings in the differential genetic sequence object of the person and a plurality of local differential strings in the reference differential genetic sequence object to produce a deviation record for the second tissue” (underlining added). While the claim defines that the differential genetic sequence object of the person is related to first and second genetic sequence strings representing, respectively, a first and a second tissue of the person, nothing is said about the reference differential genetic sequence object. For example, it is not clear how it is created/generated or which two genetic sequence strings it is related to.
96. It would appear that the body of the Specification mentions the phrase “reference differential genetic sequence object” only in paragraph [0021], which I have reproduced below in its entirety:
“[0021] The invention further provides a method of analyzing a differential genetic sequence object of a person, the method comprising: storing a reference differential genetic sequence object in a medical records database that is informationally coupled to an analysis engine; calculating, by the analysis engine, a deviation between a plurality of local differential strings in the differential genetic sequence object of the person and a plurality of local differential strings in the reference differential genetic sequence object to produce a deviation record; using, by the analysis engine, the deviation record to generate a person-specific deviation profile. In a preferred embodiment, the reference differential genetic sequence object is calculated from a plurality of local differential strings of the person. In another preferred embodiment, the reference differential genetic sequence object is calculated from a plurality of local differential strings of the person.” (underlining added)

97. With respect to the above quotation, it remains uncertain how “the reference differential genetic sequence object [that] is calculated from a plurality of local differential strings of the person” differs from the “differential genetic sequence object of a person”, which presumably must also be somehow generated/updated using local differential strings of the person. As can be seen from the above, similarly to the other instances of unclear claim wording, the body of the Specification provides little assistance with the interpretation of the phrase “reference differential genetic sequence object”. This results in an additional clarity issue for claim 18.
98. The dependent claims also suffer from the clarity issues of their respective independent claims as it does not appear that any of the additional features defined in the dependent claims has the potential to resolve these “inherited” clarity issues. Furthermore, depending on the interpretation or possible future proposed amendments to the independent claims, it may be the case that some of those additional features could lead to further clarity issues; however I consider that any such issues, if they indeed exist or emerge, should be properly addressed as part of the continued examination process.
The Applicant’s submissions on claim interpretation
99. As could be expected, given that clarity was not raised as an objection, the Applicant’s submissions on claim construction are not comprehensive. Nonetheless, the Applicant provided some comments in relation to the interpretation of certain features of claim 1, for example:
“Claim 1 defines that respective sets of multiple sub-strings (e.g. sequence reads) from the respective tissues overlap with a common genomic location, that is, a genomic location common to all of the overlapping sub-strings of the two sets.
The claim specifically defines amongst other features:
‘incrementally synchronizing the first and second genetic sequence strings at respective known positions by obtaining and storing sets of aligned sub-strings from the first and second genetic sequence strings that have a next common genomic location’.

This is performed in a manner that is computationally efficient, so that less memory and less processing time is required. For example, the method can move from one genomic location to a next genomic location. At each genomic location, the sub-strings (e.g., sequence reads) that overlap with that genomic location (i.e., have that location in common) are obtained and stored in memory. Other sub-strings not overlapping that genomic location are not obtained and stored, so that less memory can be used and the memory can be accessed faster. In this manner, such sets of sub-strings from the two sequence strings can be compared to each other to identify any differences at the common genomic location between the two tissues. Then, for the next genomic location, only the sub-strings overlapping this next genomic location are obtained and stored, and the other sub-strings can be removed from memory.” (pages 4-5, original italic, underlining added)
“The claimed invention is directed to efficiently identifying genomic differences between two tissues using measured sub-strings of DNA molecules, …
…
We submit that:
(a) In D2, there are only two sequences, and not a set of sub-strings (e.g., sequence reads of DNA molecules) that correspond to a sequence string (e.g., a chromosomal region). It is unclear what the examiner is alleging to be ‘the sub-strings’ as claimed.
…
In contrast, claim 1 of the instant invention recites that multiple sub-strings (e.g., sequence reads) overlap with a given genomic location, thus making it a common genomic location to all of those overlapping sub-strings in respective sets of sub-strings corresponding to respective tissues.
…
(c) … But, claim 1 recites that the sub-strings are already aligned (e.g., to a reference that occurs before storing).
Thus, D2 does not disclose aligning sub-strings to each other where the sub-strings have previously been aligned to another sequence, as recited by ‘aligning the first set of aligned sub-strings and the second set of aligned sub-strings through their respective genomic location information using the common genomic location.’” (pages 8-9, original bold, underlining added except for “all” in bold where underlining is original)

100. I note that some of the above comments attempt to interpret the phrases in claim 1 by providing examples of what each phrase in question may encompass. Such examples clearly cannot impart an interpretation that is limiting to the scope of the claim. In addition, as I already noted in regard to the sequence reads, the Applicant’s examples do not necessarily appear to be within the meaning of the phrases in question. In general, the Applicant’s comments on the interpretation appear to be more reflective of what the Applicant may have intended to claim as opposed to the actual wording used in the claim. For these reasons, I do not find that the Applicant’s comments on claim construction are helpful. It may well be the case that further submissions and/or evidence that might be provided by the Applicant during the continued examination could be of assistance in resolving the issues.
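For illustration only, the location-by-location procedure described in the Applicant's submissions quoted at paragraph 99 (moving from one genomic location to the next, and holding in memory only the sub-strings that overlap the current location) might be sketched as below. This is a minimal sketch under stated assumptions, not a construction of claim 1 and not the Applicant's actual implementation. The names `Read` and `sweep` are hypothetical, each sub-string is assumed to be already aligned to a shared reference coordinate, and a "difference" is assumed to mean simply that the sets of bases observed at a location differ between the two tissues. Claim terms such as "local differential string" and "differential genetic sequence object" are deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Iterator, List, Set, Tuple


@dataclass(frozen=True)
class Read:
    """Hypothetical aligned sub-string: covers [start, start + len(bases))."""
    start: int
    bases: str

    @property
    def end(self) -> int:
        return self.start + len(self.bases)

    def base_at(self, pos: int) -> str:
        return self.bases[pos - self.start]


def sweep(first: List[Read], second: List[Read],
          positions: range) -> Iterator[Tuple[int, Set[str], Set[str]]]:
    """Visit positions in increasing order; hold only overlapping reads."""
    first = sorted(first, key=lambda r: r.start)
    second = sorted(second, key=lambda r: r.start)
    i = j = 0
    held_first: List[Read] = []
    held_second: List[Read] = []
    for pos in positions:
        # Obtain and store reads that start at or before this position.
        while i < len(first) and first[i].start <= pos:
            held_first.append(first[i])
            i += 1
        while j < len(second) and second[j].start <= pos:
            held_second.append(second[j])
            j += 1
        # Remove reads that no longer overlap this position.
        held_first = [r for r in held_first if r.end > pos]
        held_second = [r for r in held_second if r.end > pos]
        if not held_first or not held_second:
            continue  # no location common to both sets here
        a = {r.base_at(pos) for r in held_first}
        b = {r.base_at(pos) for r in held_second}
        if a != b:  # assumed, simplistic notion of a difference
            yield pos, a, b
```

With two overlapping reads per tissue, for example `[Read(0, "ACGT"), Read(2, "GTAA")]` against `[Read(0, "ACGT"), Read(2, "GCAA")]` over `range(0, 6)`, the sketch reports a single differing location, position 3. Nothing in the sketch resolves the construction questions discussed above, such as what "next" is measured against or whether the sets obtained at the next location are themselves aligned to each other.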
Claim interpretation and clarity – conclusion
101. As I have already noted, the claims are to be interpreted purposively on the basis of the actual wording chosen by the Applicant, as clarified when necessary by the body of the Specification. In my earlier discussions, I have identified a number of clarity issues with the claims as worded. I have also established that the body of the Specification does not appear to provide much assistance in resolving these issues.
102. While I accept that, based on the body of the Specification, it may arguably be possible to ascertain what might reasonably be expected to be the invention that the Applicant would intend to claim, as a matter of fact this is not what is actually claimed. I consider that any interpretation exercise, based on the body of the Specification, would be highly speculative and more akin to notional claim re-drafting rather than claim construction. Furthermore, I am unable to find any justification for a hypothetical proposition that a claim would always be clear as long as the body of the specification provides enough information to make it possible to derive what could potentially be the invention that the applicant would want to claim.
103. As an additional comment, it can be seen from my previous discussions that the wording of the claims appears somewhat detached from the disclosure in the body of the Specification outside the consistory statements. While such a conclusion may potentially lead to a negative finding with respect to compliance with the requirement, under s 40(3), that the claims must be supported by matter disclosed in the specification, in light of the serious clarity issues casting significant doubt on the scope of the claims, I do not consider that it would be appropriate for me to decide on the issue of support at the present point in time.
104. I conclude that, for the reasons explained above, the Specification does not comply with s 40(3), because all claims as presently proposed to be amended are not clear.
THE EXAMINER’S OBJECTIONS
105. I note that the Examiner did not raise clarity as a formal objection, although some clarifications of his interpretation of the claims and the possible implications to clarity are present, e.g., in objection item 6 from Examination report No. 2:
“As a preliminary matter, the response proceeds on the basis that the proper interpretation of the claim is that the limitation of ‘hav[ing] a common genomic location’ is applicable between the individual substrings within the first and second sets. I have interpreted the claims as requiring the common genomic location as relating the first and second sets and instead that within those individual sets, the substrings may be aligned with respect to each other without containing any common positions. If both constructions are open then the claim appears to be prima facie unclear.” (underlining added)

106. The above quotation confirms that the proper interpretation of certain features of the claims was indeed part of the exchange of arguments during examination. Despite that, it still appears to me that the Examiner tried to look beyond the imprecise wording of the claims and attempted to interpret the claims in a sensible way, probably on the basis of the disclosure of the Specification as a whole; as a result, the clarity issues that I have identified remained largely unaddressed.
107. In view of the serious clarity issues identified through my analysis earlier in this decision, I do not consider that it will be appropriate for me to decide on any of the matters raised in the Last Report. Given the uncertainties in the scope of the claims, any such decision would be bound to be conditional on some assumptions and, ultimately, not very useful in view of the future prosecution of the Application. In fact, I do not believe that any binding statements on other grounds of objection should be made until the clarity issues are resolved.
CONCLUSION
108. I have found that the claims as proposed to be amended are not clear. In light of this, I do not consider it appropriate to decide on the issues with respect to manner of manufacture, lack of unity, and inventive step that were raised in the Last Report.
109. In the circumstances, the examination of the Application should continue. I note that on page 15 of their written submissions, the Applicant requested:
“In the event that the delegate is minded to remit the application for further examination, we request the opportunity to make submissions on a suitable time period for putting the application in order for acceptance pursuant to regulations 13.4(1) and (3).”

However, I note that the Applicant could have provided any such submissions together with their written submissions.
110. As noted by the Applicant, the “suitable time period for putting the application in order for acceptance” is governed by the Regulations. Under reg 13.4(1)(g), this period is “3 months from the date the decision is made”. However, the Commissioner does have the discretionary power, under reg 13.4(3), to “substitute a period longer than 3 months, if the Commissioner is satisfied that acceptance of the patent request and complete specification should be postponed”. It is reasonable to think that the current state of the application and its prosecution history are relevant considerations in exercising such a discretionary power. In the circumstances of this case, given the degree of uncertainty regarding the presently claimed invention and taking into account the fact that clarity was not raised as a formal objection, I consider it fair and appropriate to use my discretionary power and to provide a period of 9 (nine) months from the date of this decision to gain acceptance of the patent request and complete specification in relation to the Application.
111. In view of the findings in this decision, as a first step of the continued examination, the Applicant should propose suitable amendments and/or provide convincing evidence in order to clarify the claimed invention and, on that basis, to address the outstanding Examiner’s objections.
Dr V. Z. Kolev
Delegate of the Commissioner of Patents