[Proteomics] Peptide Identification - DB Search


In this post, let's talk about "peptide identification." Peptide identification can be thought of as working out a peptide's amino-acid sequence.

Outline

Peptide identification breaks down into three broad steps.

Tandem Mass Spectrometry → Peptide Sequencing → Database search

Of these, the foundational paper for DB search is Sequest, which we'll come back to in a later paper-review post.

Proteomic Data Analysis Pipeline

The overall pipeline for proteomic data analysis is easy to follow from the figure below.

Untitled

First, proteins are extracted from the sample and then digested with an enzyme (protein digestion). In this step the protein sequences become peptide sequences. The peptides are ionized and measured by mass spectrometry (MS) to give the MS1 spectrum, from which precursor ions are selected. Each precursor ion is then passed through mass spectrometry again to obtain MS2 spectra (the analysis methods differ in how these spectra are interpreted). Analyzing the MS2 spectra yields the peptide information, and finally we determine which protein it came from.

Let's go through the analysis again, step by step. Suppose we need to analyze the following protein sequence.

Untitled 1

Generate Peptides using Specific Enzyme

Protein complex → Enzyme → Peptides

A specific enzyme digests the protein sequence into peptides. In this figure the cleaving enzyme is trypsin, which cuts on the C-terminal side of K and R.

Untitled 2

Mass spectrum

Protein complex → Enzyme → Peptides → Mass spectrometry (MS) → MS1 spectra

Next, the mass of each peptide is measured. The measured quantity is $m/z$ (mass over charge) and goes on the x-axis, running from light to heavy. The y-axis is intensity, which shows how much of each peptide is present.

This step can be drawn as in the figure below and corresponds to MS1. It tells us which peptides make up the protein and how large they are.

Untitled 3

Select one peak

Protein complex → Enzyme → Peptides → Mass spectrometry (MS) → MS1 spectra → Mass spectrometry (collision energy) → MS2 spectra

One of these peaks is selected. The selected peak (peptide) is called the precursor and is carried through the remaining steps.

Untitled 4

Tandem Mass spectrum

Protein complex → Enzyme → Peptides → Mass spectrometry (MS) → MS1 spectra → Mass spectrometry (collision energy) → MS2 spectra

The precursor selected in MS1 gains energy by colliding with an inert gas and breaks apart into fragment ions (fragmentation). The side chains stay intact and the backbone breaks readily, but we do not know in advance where along the backbone the break will occur.

Untitled 5

Protein complex → Enzyme → Peptides → Mass spectrometry (MS) → MS1 spectra → Mass spectrometry (collision energy) → MS2 spectra

Because we don't know where the backbone breaks, we consider every possibility. Depending on the charge of the precursor ion, one side may be an ion while the other is not, or both sides may be ions. A peptide that contains K or R also ionizes more easily, which is one reason trypsin is used.

The whole process, from MS and MS1 spectra analysis to MS2 (collision energy), is called tandem mass spectrometry (MS/MS), and it has a few characteristics.

  • Tandem Mass Spectrometry (MS/MS) : mainly generates partial N- and C- terminal peptides.

    Relative to the point of fragmentation, the front part is called the N-term or prefix, and the back part the C-term or suffix.

  • Spectrum consists of different ion types because peptides can be broken in several places.

    As mentioned above, we don't know where the backbone breaks, that is, which bond gets cut. Nor do we know which side ends up as the ion. Some sites break more easily depending on the amino-acid sequence, but there is no formula that captures this.

  • Chemical noise often complicates the spectrum.

    Chemical noise can complicate the spectrum; in other words, the measurement is quite sensitive. The noise can come from outside, a typical example being keratin, the protein in the experimenter's hair, or it can come from internal fragments.

  • Represented in 2-D: mass/charge axis vs. intensity axis

    The spectrum is drawn in 2-D, with a mass/charge ($m/z$) axis and an intensity axis, just as in MS1.

In the spectrum shown in the figure below, the mass difference between neighboring peaks can be read as the mass of an amino acid. The example spectrum (the yellow box in the figure below) is relatively clean and tidy; real spectra are much messier.

Untitled 6

The figure above marks only the prefixes of the peptide. Reading along the prefixes is called forward, and reading along the suffixes is called reversed. This is the ideal case, where we can tell which peak is which. In practice we don't know which peaks are prefixes and which are suffixes, so we infer it by making assumptions.

Protein complex → Enzyme → Peptides → Tandem Mass spectrometry (MS/MS) → MS2 spectra

Untitled 7

So in tandem mass spectrometry the constituent amino acids can be inferred from mass differences. How, then, are these MS2 spectra analyzed to identify the peptide?
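To make the "mass difference equals one amino acid" idea concrete, here is a minimal sketch in Python. It assumes an idealized ladder of singly charged prefix peaks with no noise; the function name `read_ladder`, the tolerance, and the shortened residue table are my own choices for illustration, not something from the lecture.

```python
# Monoisotopic residue masses (Da) for a handful of amino acids.
# A real table has all 20; this subset is only for illustration.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "K": 128.09496, "E": 129.04259,
}

def read_ladder(peaks, tol=0.01):
    """Translate the gaps between consecutive prefix peaks into residues.

    peaks: ascending m/z values of singly charged prefix ions.
    A gap that matches no residue (or more than one) is reported as '?'.
    """
    letters = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol]
        letters.append(hits[0] if len(hits) == 1 else "?")
    return "".join(letters)

# Ladder for the prefixes G, GA, GAS, GASE, starting from a bare proton.
ladder = [1.00728, 58.02874, 129.06585, 216.09788, 345.14047]
print(read_ladder(ladder))  # GASE
```

Real spectra mix prefix and suffix peaks with noise, which is exactly why the scoring methods later in this post are needed.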

Peptide identification

There are two broad approaches to peptide identification.

  • Database search (Sequest)
  • de Novo interpretation (Sherenga)

DB search is useful in most cases, but more in-depth analyses turn to de novo interpretation. The difference between the two is whether a DB is available. A DB is only a reference, so DB search loses its usefulness when the protein is unique to an individual, as with antibodies. In short, when there is no DB, de novo interpretation is the more effective choice.

Untitled 8

Peptide identification

The goal of peptide identification is to "find a peptide with maximal match between an experimental and theoretical spectrum." That is, we look for the peptide whose theoretical (very basic) spectrum agrees best with the experimental spectrum.

There are four inputs.

  • S : experimental spectrum
  • △ : set of possible ion types
  • m : precursor $m/z$
  • c : charge

△ (delta) is supplied as an input to account for where the ion breaks. If c (charge) is not given, every possible charge is considered, although MS1 usually provides a value for c.

The precursor $m/z$ is the precursor's neutral mass plus charge $\times$ proton mass, all divided by the charge.

$$m = \frac{M + c \times m(H^{+})}{c}$$

In the equation above, the precursor m/z is the observed value, and the charge in the charge * proton mass term can be 1 to 3. The goal is then to find a peptide whose mass is close to the precursor's neutral mass computed this way.

Accordingly, the output is "a peptide with mass $M = (m - m(H^{+})) \times c$, whose theoretical spectrum matches the experimental spectrum S best." That is, we get the peptide of mass M whose theoretical spectrum agrees best with the experimental spectrum S.
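The two formulas are inverses of each other, which is easy to check in code. The sketch below assumes a proton mass of 1.007276 Da; the function names are mine. When the charge is unknown, it simply tries charges 1 to 3, as described above.

```python
PROTON = 1.007276  # proton mass in Da (assumed constant for this sketch)

def precursor_mz(neutral_mass, charge):
    """m = (M + c * m(H+)) / c"""
    return (neutral_mass + charge * PROTON) / charge

def neutral_mass(mz, charge):
    """M = (m - m(H+)) * c"""
    return (mz - PROTON) * charge

def candidate_masses(mz, charges=(1, 2, 3)):
    """Neutral mass under each possible charge, for when c is not given."""
    return {c: neutral_mass(mz, c) for c in charges}

mz = precursor_mz(1000.0, 2)   # about 501.007276
print(mz)
print(candidate_masses(mz))    # the charge-2 entry is about 1000.0
```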

We said the difference between DB search and de novo is whether a DB is available; the next figure makes this more concrete.

Untitled 9

In DB search only the given database is considered, so the cost is fixed at about $10^{8}$ candidates. De novo, by contrast, has to consider every possible sequence, $20^{n}$ for a peptide of length $n$, and is treated like a dynamic-programming (DP) problem of finding the best path. The peptides a mass spectrometer can detect range from about 6 to 50 amino acids in length.

The goal of peptide identification by DB search is to "find a peptide from the database with maximal match between an experimental and a theoretical spectrum." It is the same as general peptide identification except that the peptide is drawn from a database.

Accordingly, there is one more input, for five in total.

  • S : experimental spectrum
  • P : database of peptides
  • △ : set of possible ion types
  • m : precursor $m/z$
  • c : charge

Here △ (delta) differs by ion type; b- and y-ions are the major ones produced.

The output is likewise "a peptide of mass M from the database whose theoretical spectrum matches the experimental spectrum S best." In other words, the answer comes from the DB.

Untitled 10

Database

So which database does DB search use? UniProtKB is the most common choice.

Untitled 11

UniProtKB consists of Swiss-Prot and TrEMBL. The two differ in whether entries were manually annotated and reviewed by experts (Swiss-Prot) or annotated automatically and left unreviewed (TrEMBL). The database is also broadly divided into categories such as Human, Bacteria, and Virus.

Database - protein

A protein DB is stored as a .fasta file, which lists protein sequences along with their header information.
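Reading such a file takes only a few lines. The sketch below is a minimal FASTA reader in Python, assuming the usual layout where each record starts with a `>` header line followed by one or more sequence lines. The example record is invented for illustration and is not a real UniProtKB entry.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Hypothetical two-record file; headers and sequences are made up.
example = """>example_1 made-up protein
MKWVTFISLL
FLFSSAYSR
>example_2 another made-up protein
GASPEPTIDEK
"""
for header, seq in parse_fasta(example):
    print(header, len(seq))
```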

Untitled 12

Database - peptide (In silico digestion)

Given protein sequences like the ones above, peptide sequences are obtained by simulating digestion under specified conditions. This can also be implemented in code, which I plan to cover in more detail in a later post.

In outline: the input is a protein sequence, and the parameters are the enzyme rule (which enzyme cuts and where), the number of missed cleavages, and the enzymatic site information (fully, semi, none). The output is the peptide sequences together with the information for each parameter. The figure below shows the process.
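As a rough illustration of these parameters, here is a small in silico digestion sketch in Python. It covers only the fully enzymatic case with the trypsin rule mentioned earlier (cut after K or R) and a configurable number of missed cleavages. The function name and defaults are my own, and refinements that real tools apply, such as semi-enzymatic peptides, are left out.

```python
def digest(protein, cleave_after="KR", missed_cleavages=0, min_len=1, max_len=50):
    """Fully enzymatic digestion: cut after every residue in `cleave_after`.

    missed_cleavages lets up to that many cut sites inside a peptide go uncut.
    """
    # Cut positions: both ends of the protein plus the index after each K or R.
    # The last residue is skipped so the protein end is not listed twice.
    cuts = [0]
    cuts += [i + 1 for i, aa in enumerate(protein[:-1]) if aa in cleave_after]
    cuts.append(len(protein))

    peptides = []
    for i in range(len(cuts) - 1):
        last = min(i + 2 + missed_cleavages, len(cuts))
        for j in range(i + 1, last):
            peptide = protein[cuts[i]:cuts[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides

print(digest("AKRGK"))                      # ['AK', 'R', 'GK']
print(digest("AKRGK", missed_cleavages=1))  # ['AK', 'AKR', 'R', 'RGK', 'GK']
```

Setting `min_len=6` would drop peptides too short to be useful, in line with the 6 to 50 residue range mentioned above.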

Untitled 13

Basics for theoretical spectrum generation

Let's look at how a theoretical spectrum is built from the peptide DB we just obtained.

Taking glycine as an example, it looks like the figure below.

Untitled 14

Glycine (G) has the basic backbone $C_{2}H_{2}NO$ with H at the R position. "Free amino acid" refers to G on its own, and "amino acid residue" refers to G bound within a peptide. If it carries no charge, it is in the neutral state. The monoisotopic mass is computed by summing, for each atom in the molecule, the mass of its most abundant naturally occurring stable isotope. (This comes up again two figures down.)

Untitled 15

The figure above lists the residue of each amino acid, along with the average mass (Avg. mass), which takes isotopes into account, and the monoisotopic residue mass (Mono. mass).

Note that cysteine (C) and methionine (M) also contain sulfur (S), while the formulas of the other amino acids consist only of carbon (C), hydrogen (H), nitrogen (N), and oxygen (O).

The next figure shows an MS1 plot. Because of isotopes, each peptide shows up as a series of peaks, and the first peak has the smallest mass. The second, third, and fourth peaks correspond to +1 Da, +2 Da, and +3 Da. The charge can be calculated from the spacing between the monoisotopic peak and the next peak: on the $m/z$ axis, a 1 Da step appears as a gap of about $1/c$ for charge $c$.
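The charge calculation can be sketched as follows. I assume the isotope step is about 1.00335 Da (the mass difference between carbon-13 and carbon-12) rather than exactly 1 Da; the function name and the `max_charge` limit are illustrative choices.

```python
ISOTOPE_STEP = 1.00335  # approx. 13C - 12C mass difference in Da (assumption)

def charge_from_isotopes(mono_mz, next_mz, max_charge=6):
    """Estimate the charge from the m/z gap between the first two isotope peaks."""
    gap = next_mz - mono_mz
    if gap <= 0:
        raise ValueError("next_mz must be larger than mono_mz")
    charge = round(ISOTOPE_STEP / gap)
    if not 1 <= charge <= max_charge:
        raise ValueError("gap does not correspond to a plausible charge")
    return charge

print(charge_from_isotopes(501.0073, 501.5090))  # 2
print(charge_from_isotopes(400.0000, 400.3345))  # 3
```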

Untitled 16

  • Monoisotopic mass is the mass determined using the masses of the most abundant isotopes.
  • Average mass is the abundance weighted mass of all isotopic components.

Keep in mind that the gap between the average mass and the monoisotopic mass (some fraction of a dalton) grows with the number of carbon atoms.

The name of a fragment ion depends on where the peptide is cut during fragmentation.

Untitled 17

์ž˜๋ฆฐ ๋ถ€๋ถ„์„ ๊ธฐ์ค€์œผ๋กœ ์ขŒ์ธก๊ณผ ์šฐ์ธก์€ (a-ion, x-ion), (b-ion, y-ion), (c-ion, z-ion)๊ณผ ๊ฐ™์ด ์Œ์„ ์ด๋ฃจ๋Š” ์ด์˜จ์˜ ํ˜•ํƒœ๋กœ ์กด์žฌํ•œ๋‹ค. ์ด๋•Œ ๊ฐ ์ด์˜จ์˜ ์•„๋ž˜ ์ฒจ์ž๋Š” residue์˜ ๊ฐฏ์ˆ˜๋ฅผ ์˜๋ฏธํ•˜๋ฉฐ, ๋ณธ์ธ์€ C์— ๊ฒฐํ•ฉ๋œ R์˜ ๊ฐฏ์ˆ˜๋กœ ์ƒ๊ฐํ•˜๋ฉด ์ข‹๊ฒ ๋‹ค๋Š” ์ƒ๊ฐ์„ ํ–ˆ๋‹ค. ์œ„์— ์ œ์‹œ๋œ ๊ทธ๋ฆผ์—์„œ ๋ณด์ด๋“ฏ์ด, ๋นจ๊ฐ„ ์ ์„ ์˜ ๋ฐ•์Šค๊ฐ€ ์•„๋ฏธ๋…ธ์‚ฐ residue mass๋ฅผ ์˜๋ฏธํ•œ๋‹ค.

The mass of a neutral peptide is the sum of its residue masses plus the masses of the terminating groups, namely the H at the N-terminus and the OH at the C-terminus.
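In code, this is a sum over a lookup table plus one water (H + OH). The sketch below uses standard monoisotopic residue masses rounded to five decimals; the function name is mine.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # H on the N-terminus plus OH on the C-terminus

def peptide_mass(sequence):
    """Neutral monoisotopic mass: residue masses plus the terminating groups."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

print(round(peptide_mass("PEP"), 5))  # 341.15867
```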

A more detailed list of amino-acid residue masses can be found at the following link.

E.g. PEP

Taking PEP as an example, we can calculate its mass, the mass of each ion, the chemical structure of each ion, and the spectrum, as follows.

image

Untitled 18

In the spectrum, the y-ions have relatively larger masses. The ionization efficiency of a given fragment ion depends on many factors, including its chemical properties, its charge state, and the experimental conditions used for ionization. The y-ions come out heavier than the corresponding b-ions because of how they are formed during peptide fragmentation, so to see why, we first need to understand how y- and b-ions form.

A b-ion forms when a peptide bond is cleaved and the charge stays on the N-terminal piece, so it contains the peptide's N-terminus. A y-ion is the complementary piece and contains the peptide's C-terminus. The y-ion keeps the C-terminal OH and picks up an extra H at its newly formed N-terminus, so for the same residues it weighs about one water (roughly 18 Da) more than a b-ion. That is why the y-ions sit at higher masses than the b-ions with the same subscript.
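The PEP example can be reproduced with a short sketch. It uses the usual convention that b-ions are N-terminal fragments and y-ions are C-terminal fragments, with a b-ion being the residue sum plus a proton and a y-ion the residue sum plus water plus a proton. The function name is mine, and the table holds only the two residues needed here.

```python
RESIDUE_MASS = {"P": 97.05276, "E": 129.04259}  # only what PEP needs
WATER = 18.01056
PROTON = 1.007276

def by_ions(sequence, charge=1):
    """Return (b, y) lists of fragment m/z values for cuts 1 .. len-1."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    b_ions, y_ions = [], []
    for i in range(1, len(sequence)):
        b_neutral = sum(masses[:i])           # first i residues (N-terminal side)
        y_neutral = sum(masses[-i:]) + WATER  # last i residues (C-terminal side)
        b_ions.append((b_neutral + charge * PROTON) / charge)
        y_ions.append((y_neutral + charge * PROTON) / charge)
    return b_ions, y_ions

b, y = by_ions("PEP")
print([round(m, 3) for m in b])  # [98.06, 227.103]
print([round(m, 3) for m in y])  # [116.071, 245.113]
```

Because PEP reads the same in both directions, each y-ion here has the same residues as the b-ion with the same subscript, so the two differ by exactly one water.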

The relative abundance of the different fragment-ion types in a spectrum depends on many factors, including the fragmentation method used, the peptide sequence, and the ionization efficiency of each ion type. In spectra produced by collision-induced dissociation (CID), y-ions generally tend to be more abundant than b-ions.

Trypsin is used as the cleaving enzyme in many experiments, and it cuts on the C-terminal side of K and R, so each peptide ends in a residue that ionizes easily. One way to think about it is that this gives the y-ions more opportunities to be ionized.

Untitled 19

You might think it would help to generate the spectrum for every peptide in advance. But the DB being compared against changes every time, so precomputed spectra are useless in most cases, and the spectrum also changes with the parameters applied.

As in the first figure above, proteins are loaded from a protein sequence DB, the fragments produced by the cleaving enzyme are derived, and then the MS/MS spectrum is calculated and compared with the spectrum from the experiment.

The second figure shows how spectrum prediction has changed with recent advances in deep learning. Prosit is a deep learning model trained on 550,000 tryptic peptides and 21 million high-quality tandem mass spectra, and it is reported to have been trained by synthesizing the data in advance and comparing against it. In other words, a new dataset was built for training, and the model predicts the relative intensities of the b- and y-ions.

Prosit will also be covered in a later paper review.

Match between spectra

So how is the experimental spectrum compared with the spectrum derived from the DB?

In the figure below, a query spectrum (the experimental spectrum) is compared against the entries of a spectral database to produce a result. There are several ways to make this comparison.

Untitled 20

Match between spectra - SPC

The first is SPC, the Shared Peak Count. SPC counts the peaks (= masses) that two spectra share; think of it as the "number of common peaks."

  • The match between two spectra is the number of masses (peaks) they share (Shared Peak Count or SPC)
  • In practice mass-spectrometrists use the weighted SPC that reflects intensities of the peaks
  • Match between experimental and theoretical spectra is defined similarly
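A minimal unweighted SPC can be sketched in Python as below. Since measured masses never match exactly, two peaks count as shared when they fall within a tolerance; the tolerance value and the one-to-one matching rule are my assumptions for this sketch.

```python
def shared_peak_count(spectrum_a, spectrum_b, tol=0.5):
    """Count peaks of spectrum_a that pair up with a peak of spectrum_b.

    Both inputs are lists of m/z values. Each peak is used at most once.
    """
    a, b = sorted(spectrum_a), sorted(spectrum_b)
    i = j = count = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count

theoretical = [98.06, 116.07, 227.10, 245.11]
experimental = [98.1, 150.0, 245.0, 300.0]
print(shared_peak_count(theoretical, experimental))  # 2
```

A weighted variant, as the second bullet notes, would add up the intensities of the matched peaks instead of counting them.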

Match between spectra - SEQUEST

Next is SEQUEST. SEQUEST was published as a paper and is known for using cross-correlation. Cross-correlation is a measure of the similarity of two series as a function of the displacement of one relative to the other. Hmm, that phrasing reads a bit too much like a translation.

Put simply, the two spectra are transformed, for example with a Fourier transform, so that they can be compared on similar terms. This was the core of the SEQUEST paper, so I will cover it in detail in a later paper review.

Untitled 21

Match between spectra - SEQUEST/Comet

Because SEQUEST was commercialized, Comet was built as a tool that uses a similar method. It also evaluates the relative match through cross-correlation; the autocorrelation part is treated as background, so unlike the cross-correlation it is not evaluated directly. Both Sequest and Comet score the match as a relative value, the XCorr score.
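To give a feel for XCorr, here is a heavily simplified sketch in pure Python. It assumes the common description of the score as the dot product of the two binned spectra at zero offset minus the average dot product over nearby nonzero offsets (the background). The bin width, the 75-bin offset window, and the function names are assumptions, and the spectrum preprocessing that SEQUEST and Comet perform is omitted entirely.

```python
def bin_spectrum(peaks, bin_width=1.0, n_bins=2000):
    """Turn (m/z, intensity) pairs into a fixed-length intensity vector."""
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        if 0 <= index < n_bins:
            vector[index] = max(vector[index], intensity)
    return vector

def dot_at_shift(x, y, shift):
    """Sum of x[i] * y[i + shift] over all valid i."""
    n = len(x)
    return sum(x[i] * y[i + shift] for i in range(max(0, -shift), min(n, n - shift)))

def xcorr(observed, theoretical, max_shift=75):
    """Zero-offset correlation minus the mean correlation at nonzero offsets."""
    at_zero = dot_at_shift(observed, theoretical, 0)
    background = [dot_at_shift(observed, theoretical, s)
                  for s in range(-max_shift, max_shift + 1) if s != 0]
    return at_zero - sum(background) / len(background)

observed = bin_spectrum([(100.2, 1.0), (200.7, 1.0), (300.1, 1.0)], n_bins=400)
good = bin_spectrum([(100.0, 1.0), (200.0, 1.0), (300.0, 1.0)], n_bins=400)
bad = bin_spectrum([(150.0, 1.0), (250.0, 1.0), (350.0, 1.0)], n_bins=400)
print(xcorr(observed, good))  # 3.0
print(xcorr(observed, bad))   # slightly negative
```

Subtracting the background is what makes the score relative: a candidate that lines up only when shifted gets no credit for it.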

Untitled 22

Match between spectra - X!Tandem score

Another method is the X!Tandem score. Two scores are involved here: the by-score and the Hyperscore. The by-score is the sum of the intensities of the peaks that match b- or y-ions, and the Hyperscore is the by-score multiplied by (number of matched y-ions)! and (number of matched b-ions)!. The figure below makes this easy to see, and the formula is shown on the right side of the figure.
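Following the description above, the two scores can be sketched in a few lines of Python. The inputs are assumed to be the intensities of the experimental peaks that were already matched to b- and y-ions; the matching step itself and the function names are not from the lecture.

```python
from math import factorial

def by_score(matched_b, matched_y):
    """Sum of the intensities of peaks matched to b- or y-ions."""
    return sum(matched_b) + sum(matched_y)

def hyperscore(matched_b, matched_y):
    """by-score multiplied by (number of b matches)! and (number of y matches)!."""
    n_b, n_y = len(matched_b), len(matched_y)
    return by_score(matched_b, matched_y) * factorial(n_b) * factorial(n_y)

b_hits = [10.0, 20.0]       # two matched b-ion peaks
y_hits = [30.0, 5.0, 5.0]   # three matched y-ion peaks
print(by_score(b_hits, y_hits))    # 70.0
print(hyperscore(b_hits, y_hits))  # 840.0
```

The factorials reward candidates that explain many peaks, so the score grows very quickly with the number of matches, which is why it is usually viewed on a log scale.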

Untitled 23

From these scores we can plot "# of Matches" against "Hyperscore," as shown below. In the first plot the relationship is nonlinear, which makes the best hit hard to evaluate, but as the second plot shows, taking the log of the "# of Matches" axis makes it linear. The part corresponding to the best hit can then be determined.

Untitled 24

Summary

This post looked at how a peptide's sequence is determined, covering tandem mass spectrometry (MS/MS), peptide sequencing, and database search.
SEQUEST in particular is the foundational paper for DB search, and I plan to post a paper review of it later. Sequest is a commercial tool, whereas X!Tandem is a public one, which let us look at how its method works.


This post is based on the lecture materials of Prof. Eunok Paek of the Department of Computer Software and the Department of Artificial Intelligence at Hanyang University!

PS. Further questions and comments are welcome. They will help me grow too. :)