Tables
erDiagram
Samples {
sample_id int PK
dna_sequence varchar
species varchar
}
Mermaid
복사
Question
Biologists are studying basic patterns in DNA sequences. Write a solution to identify sample_id with the following patterns:
•
Sequences that start with ATG (a common start codon)
•
Sequences that end with either TAA, TAG, or TGA (stop codons)
•
Sequences containing the motif ATAT (a simple repeated pattern)
•
Sequences that have at least 3 consecutive G (like GGG or GGGG)
Return the result table ordered by sample_id in ascending order.
DataFlow
•
like ATG%
•
like %(TAA, TAG, TGA)
•
IN ATAT
•
IN GGG
Answer
SELECT
sample_id,
dna_sequence,
species,
CASE WHEN LEFT(dna_sequence, 3) = 'ATG' THEN 1 ELSE 0 END AS has_start,
CASE WHEN RIGHT(dna_sequence, 3) IN ('TAA', 'TAG', 'TGA') THEN 1 ELSE 0 END AS has_stop,
CASE WHEN INSTR(dna_sequence, 'ATAT') != 0 THEN 1 ELSE 0 END AS has_atat,
CASE WHEN INSTR(dna_sequence, 'GGG') != 0 THEN 1 ELSE 0 END AS has_ggg
FROM Samples
ORDER BY sample_id ASC;
SQL
복사
SELECT
sample_id,
dna_sequence,
species,
(LEFT(dna_sequence, 3) = 'ATG') AS has_start,
(RIGHT(dna_sequence, 3) IN ('TAA', 'TAG', 'TGA')) AS has_stop,
(INSTR(dna_sequence, 'ATAT') != 0) AS has_atat,
(INSTR(dna_sequence, 'GGG') != 0) AS has_ggg
FROM Samples
ORDER BY sample_id ASC;
SQL
복사
SELECT
*,
dna_sequence REGEXP '^ATG' AS has_start,
dna_sequence REGEXP 'TAA$|TAG$|TGA$' AS has_stop,
dna_sequence REGEXP 'ATAT' AS has_atat,
dna_sequence REGEXP 'GGG' AS has_ggg
FROM samples
ORDER BY sample_id ASC;
SQL
복사